WO 03/042661



CCATCCACCT CTCTGGACAG TAKTBOTGCA GCftTTGCCTO ATQGAftATAT AOCTGAQAaC 4500 

TTAaAGTCTT TACTTTJU^Ca CAGAAA6GGO TGGOOGOACT CACATCTGAS CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCA6 OAAAATCAGT CTAOTICTGT lATCTGTTGA TTTCCCATCA 4620 

CCTGACABTA ACTTTCATOA CATAGGATTC TGCOGCCAAA TTTATATCAT TAACAATGTG 4680 

TGCCTTTTTG CAAfaACTTGT AATTTACTTA TTATGTTTGA ACIAAAATSA TTGAATTTTA 4740 

CAGTATTTCT AAGAATGQAA TTGTGGTATT TTTTTCTGTA TTQATTTIAA CAQAAAAITT 4800 

CAATTTATAG AQGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAS IGTCAAATTT 4860 

TTAGCZGTAT TTOTAGCAAT TATCZAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CRACATTTTA CAACTGCAGT ATTCACCTAA 4980 

AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCAT6 GAOCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGftGTC AAOTTTTCCA OTTCTGICTA 5100 

ATTGTTEAGT TTAATGACGT AGTTCATTAG CTQGTCrTAC TCTACCAGTT TTCTGACATT 5160 

OTATTOTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATXST AATTTTAACT TTTGTOQAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAOTTT TXATGAQAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGSTT TTTATCCAAQ GAAITGCAAA AATAAATATA AATATTGCXa TTAAAAAAAA 5340 
AAAAAAAAlUt AAAAAAAAAA AAAAAAA 



Seq ID NO: 374 Protein sequence 

Protein Accession #: built from XP_031379 

1 11 21 31 41 51 

| | | | | | 

MRILICRFIiAC iQLLCVCRIiD WAM(?r!rRQQR KLVEEIC3WSY TGALKQKMWQ KKYPTCSTSPK €0 

QSPIKIDSDL TQVNVNLKKI* SK^GMDKTSIj EHTFIENTGK TVBIHLTMDY RVSGOVSBMV 120 

FKASKITFEnr GKCNMS8DGS EHSIjEGQKPP I^BMQIYCFDA DRFS8FEEAV 1QBKGKLRAI>3 ISO 

IX>F£VGrEEH tDPKAlIDGV ESVSRFGKOA JLLDPFIUJH* IiPNfiTDKYYI yHGSiyESPPC 240 

TDTVOniVFK mVSISBSQIf AVFCEVLTKQ QSOYVMLMDY lQNHFE.BQQ3r KFSRQVPSSY 300 

TTCKBEIHBAV CSSBFSKVQA DPEHYTSLLV TWERPRWYD TMIEKFAVI^Y QQLDGSDQTK 360 

EEFUTDQY(2D liGAIUTNIiLF NMSYVLQIVA ICTHGLYGKy SDQUTVmSBT DHPE Uai iFPB 420 

LIGTEEIIKE EEEGKDIEBG AJVNPQRDSA TNQIRKKEPQ ISTTTEmiRI G^niBKKTH 430 

lElSPTROSBFS GKGDVPNTSL MSTGQPVTKIi ATEKDISLT9 QTVTELPPHT VEGTSASUJD 540 

G&KXVLRSPH MHIiSGTABeL NTVSITEYEH BSLLTSFKIiD TGA&D888SS PATSAIPPIS 600 

BSHSOGCrFS EKtSFETITVD VIilPBSARUA SBDSTfiSGSB BSIJCDPSMBG NVWFPS3TDI 660 

TAQPDVGSGR E5FLQTMYTE IRVDESEKTr KSFSAGPVMS QGPSVTDI*EM E HYSTg AYFP 720 

TEVTFEAPTF SSRQGDUVaT VNWYSQTTO PVYIUVEASNS SEBSRlGLAS GLESWCAVI 760 

PLVIVSALTF ICLVVI»VGII> lyWRKCPQTA HPYIiBDSTSP SVI6TPPTPI FPlfiDDVGAI 840 

PIKHFPKHVA DLHASSGFTE EFKTXiKBFrQ BVQSCTVDLG ITADSSlIfiPD HKHKNRYIHl 900 

VAYDaSBCWJf AQIAEKDGKL TEVIWAHYVD GYNRPKAYIA AQGPLKSTAB UFHRMIWEHH 960 

VEVrVHITNL VEHGRRKCDQ YnsADGSEEY QNPIiVTQKSV CJVXiAYYTVMI FTLSUtTKIKK 1020 

GSQSCGSPSGS WTQYHYrrQW POMOVP^SL PVLTFVBKAA YAKRHKVGPV WHCSR SVCTt 1080 

TOTYIVIiDSM LQQIQHHGTV NTI^PLKHIR SQENYIiVQTE EQWPIHDTI. VEAILSKBTE 1140 

VLDSHIHAYV »AI«LIP<3PAG KrKDEKQPQli L8QSNIQQSD YSAALKQCKR EiCNRTSSIIP 1200 

VE&9RVGI8S IiSGBGrEnrai ASYrMGYYQS MEPIITOHPL LHTIKDFWBM IWDHNAQBW 1260 

MIPDGQMMAS DEPVYHPHKD BPINCETSFKV TIiMABERKCI> SNEEIOjXIQD F IIiEA TXaDDY 1320 

VIiEWHHFQCP KWPlSfPDSPIS KTPEIiISVIK EEAAHRDSPrt IVBDEHGSVT ACXFCftLTTb 1380 

KHQUSKESSV JDWOVAKMIH ZMRPGVFADl &QYQFLYKV7 IiSZiVBtSQEB IIFSTSUOSEIG 1440 
AAUEDGfiHAS SLESIiV 

Seq ID NO: 375 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4494 

1 11 21 31 41 51 

| | | | | | 

CACACATAOG CA03CAC6RT CTCACTTTOGA TCTATACACT GSnGQATTAA AACSIAACAAA 60 

CAAAAAAAAC ATTTCCTTOG CTOCCCXTTOC CTCTOCACTC TGAGAAGCAG AG6AGCCGCA 120 

CX3GCGAGGGG CCGCRGACOQ TCTGGAAATG CQAATCCTAA AQCGTTTGCT 03CTTGCATT IBO 

CAGCTCC3dT GTGTTTGCCG CCrQGATTGG GCTAATQaRT ACTACASACA ACAGAGAAAA 240 

CrTGTTGAAO AGATTQGCaXS GICCTATACA GSftGCACTQA ATCAAAAARA TraQGORRAG 300 

AAATATCCAA CATGTAATAG CCCftAAACAA TCTCCTATCA ATATTOAIlSk. AGA TCTT ACA 360 

CAAGTAAATQ TGAATCTTAA GAAACTTAAA. TTTGRGOtSTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TEGGGAAAACA GTGGAAATTA ATCTCACTRA TGACTAIX3QT 480 

GTCRGCGQAS GAGTTTCAGA AATGGTGTTT AAAGCAAQCA AGATAACTTT TCACTGGGGA 540 

ARATGCAATA TOTCATCTGR TGGATCAGAC? CA1AGTTTAI3 AA0GACAAAA ATTTCCACTT 600 

ORGATGCRAA TCTACTGCTT TCSWTGCftGAC OSATTTTCM GTTTTGAISGA AGCACTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATOCATT rrrGTTTGAGG TTGOGAiaGA A6AAAATTX6 720 

GATTTCAAAG OGATTATTGA TGf3R3STOGAA ABTOTTAGTC GTTTTGOQAA GCAGGCTGCT 780 

TTAGATCCAT TCATACKST^ GAACX:TZX:TG CCAAACTC3«^ CXGACAAGTA TTACATTTAC B40 

AATGGCTCAT TGACATCTCC TCCCTOCAOi. GACACA6TTO ACrCGGATTGK TTTTAAAGAT 900 

ACAGITAGCA TCTCTGAAAS CCAGTTGSCr UrrWrrJC I G AAGTTCTTftC AAXGCAACAA 960 

TCTOGTTATa TCATOCTGAT GGACTACTTA CAAAACAATT TTOGAGAGCA ACA<3 TACftft G 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AQATTCRTGA AGCAtSirrTOT lOBO 

AGTXCAGAAC CAQAAAATGT TCftOGClGAC GCAGAiGAATT ATACCAGCCT TCTTSXTACA 1140 

TGGGAAAGAC CTOGAGTCOT TTATGATACC ATGATTGAGA AOTTTGCAST TTTGTA CCAO 1200 

Cn£3i:TGGATG QAOAGGACCA AACCAAGCAT SATTTTTOA CAlGATGGCIA TCAAGACXT6 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAOrTATG TTCTTCRGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATAOrGG AAAATACAGC GACCAACTQA TTGTOGACAT GCCTACTGAX 1380 

AATCCTGAAC TTGATCim COCTGAATTA ATTGGAACTG AAGRAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAOACATTGA AGAAGGOGCI ATT<?FQAA7:C CTGOTACaAGA CAGTGCTACA 1500 
AAOCAAATGA OGAAAAAGGA AOCCCAGATT TCTAGCACAA CACACTACAA TCOCATAGOG 1S60 
ACQAAAIACA A1X3AAGCCAA 6ACXAAC06A TCGOCAACAA GAGQAAC3T8A ASTCTCraGA 1620 
AAGGGTGAXG TTCCCAAXAC ATCTTTftAAT TCCACITCCC AAOCAdTCAC TAAAT MGOC 1680 
ACAGAAAAAO ATATTTCCTT GACTTCTCAG ACTGTGACIG AACTGOCAOC TCACACTGTG 1740 
GAAGCn'ACTT CftQCCTCTTT AAATGaVTOaC TCTAAAACtG TTCXTAQATC TCCACATATG 1800 
AACrXGTOGQ GGACKSCAOA ATGCTTAAAT ACRGTTTCTA TAACfiiSftATA TGAQGAGGAG I860 
AGTTTATTTGA CCAGTTTCAA GCTTQATACT GGAGCTGAAQ ATTCTTCAGG CTCCAGTCCC 1920 
GGAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCX: AW3GGTATAT ATTTTCCTCC 1980 
GAAAAOCmG AGACAATAAC ATATC3ATGTC CTTATACCAQ AATCTGCTA6 AAAT<K!TTCX: 204 0 
GAAIGA^CTCAA CTTCATCftGG O^TCAGAAGAA TCACXAAA8Q ATCXnTTCTAT GGAtSGQAAAT 2100 
GTOTCGTTTC CTAfiCTCIAC AGACATAACA GCACAGCCX:G ATCTTO3ATC AGGCAGAGAG 2160 
AGCTTTCTCC AGACTAATTA CACTC5AGATA OGTGTTOaTG JVATCTGflSAA GACAACCAAQ 2220 
TOCTTTTCTG CAaGCCC3«?r GATGTCACAQ GGTCCCTCftG TTACAGATCT GGAAATGCCA 22 BO 
CATTATTCTA CCTTTSCCTA CTTCCCAACT GAGGTAACAC CTCATaCTTT TACCCCATCC 2340 
OrCCAGACAAC AGOATTTGGT CTCCAOGOTC AACGTGSTAT ACTCGCAGAC AACXXAAOOG 2400 
GTATACMTQ AGGCCA6TAA TAOTAGCCAT GA6TCTOQTA TTQGTCTAQC IXSAGGGGTTG 24fiO 
GAATCOGAGA AGAAGaCRflT TATACCCCTT GTtSATOGTGT CAGCCCTGAC TTTTATCTGT 2520 
CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGftt3C3AAAT GCTTCCAGAC TGCACACTTT 25BD 
TRCTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACAOC TATCTTTO^ 2640 
ATTTCAOATS ATOXGGGAGC AATTCCAATA AAGCACTTTC CAAASCATOT 'XGCAGATTTA 2700 
CATGCAftGTA GTCaOTTTAC TGAAGAATTT GAGGAAGTGC 3WSAQCT6IAC TGTTQACrrEA 2760 
GGTATTACAG CAGACAGCTC CAACCAOCCA GACRACAASC ACAAGAATCG ATACATAAAT 2820 
ATCGTTGCCT ATGATCATAG CAGGGTTAAQ CTAGCACAQC TTQCTGAAAA GGATQGCftAA 2880 
CTGACTGATT ATATCAATGC CJVAITATGTT GATGGCTACa AC»GACCAAA ASCTTATATT 2940 
GCXGCOCAAG GCCCS^GAA ATCCACAdSCT GAAGATTTCT GGAQAATGAT ATGGQAACAT 3000 
AATGTC3GAA6 TTATTGTCAT GATAACSiAAC CTCOTGQZUSA AAGGAAQOftS AA2\AT6TGAT 3060 
CAGTACTGGC CTGCCQATEG GAQTGAGGAG TAC36GGAACT TTCTGGTCAC ■rcaOAAQAGT 3120 
GTGCAAQTGC TTGCCTATTA TACTGTX3AGG AATTTTACTC TAAGAAACAC AhAAATAAAA 3130 
AAGGGCTCGC ASAAAQGAAG ACOCSOTaSA GGTCT66TCA CACa«3TATCA CTACACGCftG 3240 
TGGCCTQACA TGGGAG7AOC ABAGTACICC CTGCC2U&TGC TGACCTTTBI GAGftAftCGCA 3300 
GCCTATCCCA AcaCGOCATGC AGTOGGSCCI 6TTGT03TCC ACTGCAGITGC TQGAGTTGGA 5360 
AGAACAQGCA CATATATTGT GCTASACAGIT ATGrTGCftGC AGATTCAACA OGAAGGAACT 3420 
GTCAACATAT TTGGCTTCTT PAAACRCKTC OGTTCACRAA GAAATTATTT GGTACAAACT 34 BO 
GAlSSflGCAAT ATGTCXTCAT TCATGATACA CTGGTTGAGG CCATACTTRG TAAAaAAACT 3540 
GaSGTOCTGG ACAGTCATlUr TCATGCCTAT GTTAATGC3U: TOdCATTCC TGGACCAGCA 3fiO0 
GGCS^AAACAA AGCTABAGAA ACAATTCCAO CTCCTGAGOC AOICAAATAT ACAGCRfiAfiT 3660 
GACTATTCTG C:AGCX:CTAAA GCAATGCAAC AGGGAAAAGA ATOGAACTTC TTCTATCATC 3720 
CCTGtOGAAA GATCAACJQGT TGOCATTTCA TCCKrTGAGTG GAQRAGGCAC AGACTACAIJC 37B0 
AATGOCTCCT ATATCATGGG CTATTACCA6 AGCAATOAAT TCATCATTAC CCftGCWXXIT 3840 
CTCCITCATA CCATOUGGA TTTGTGGAOa AXGATATGGG HXXXSMOGC CCAACTOGIG 3500 
GTTATOATTC CTGATGGCCA AAACATCGCA GAAQAaTGAAT TTGrPTRCTG GCCZUAIAAA 3960 
GATGAGOCTA TAAATTGTQA GAGCTTTAAB GTCACTCTTA TGGCTGAAGA ACACAAATGT 4020 
CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 
TA3X5TACTTG AAOTGAQGCA CTTTC3«5TOTr CCTAAATOGC CAAATCCSiGA TASCOCCATT 4140 
AGTAAAACIT TTGAACTTAT AASTGTTATA AAAQAAGAAG CTGGCAAT3U3 GGADGGGCCT 42 QO 
ATGAXTGTTC ATGATraiSCA TGGAGGAGTQ AOGGCAGGAA CITTCXGTGC TCTGACAACC 4260 
CTTATCCACX: AACTAGAAAA AGAARA^TGC GrrGGATOTTT ACCRGGTAOC C2UW5A1GATC 4320 
AATClGArCGA GGCCAGQAGT CTTTGCTGAC AlTGRGCAGT ATCAGTlEICT CTACAAAGTtS 4380 
ATCCTCAGCC TTOTeftGCAC AAGGCAGQAA GAfiftATOCSlT CCACCTCICT GGACAGTAAT 4440 
GOTOCnaCAT TGOCIGATOG AAA'l^ATAGCT GAjQAGCTIAS AGTCTTTAGX TTAACACftGA 4500 
AAGGGGTGGG aOGACXCACA TCTGAGCATT GTTTTOCTCT TGCTAAAATT AGGCROGAAA 4560 
ATCAtnClAG TTCTGTTATC rGTTGATTTC CX^iTCACCTG ACWSTAACIT TCKCGACATA 4620 
GGATTCTOCC GCCAAATTTA TATCATTflAC AATGTGTGCC TTTTTGCAAG ACTTGTAATT 46B0 
TACXTATTAT GrTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATTGT 4*740 
GGTATTTTTT *rCI6XATTGA TTTTAACASA AAATTTCAAT TTATAGMQT TASGAATTC3C 4900 
AAACTACnGA AAATGTTTOT TTTTRGTQTC AAATTTTTAG CTBTATtTGT AGCAATTATC 4B60 
ABGrriQCTA GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920 
GATATTCAAC ATTTTACAAC TGCAGXATTC ACCTAAAGTA GAfiATAATCT GTTAUTi'ATT 4930 
OTAAATACTG CCCTAGTGTC TCCATGGACC AAAMXATAT TTATAATTST AGATTTTTAT 50 40 
ATTTTaCinC TfaAGXCAAGT TTTCXAiSrCC TGTQTAATTG TnAGTETAA TQACIGTAGTT 510O 
CATTAGCTGG TCTTACTCTA CC3«3TTTTCT GACATJlGTaT TGTGTTACCI AAGTCATTAA 5160 
CTTTGTTTCA GCATGXAATT TTAACTTlTG TG6AAAATAG AAArCACClTC ATOTTGAAAG S220 
AAGTTTrTAT <3iGAATAAJCA C3CTTACCAAA CAMGTTCAA ATGGTTTTTA TCCAAGGAAT 52 80 
TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 376 Protein sequence 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

| | | | | | 

MRILmPliAC! I0hLCVC3aJ> VAIilGYYKQQS KIiVBEIGWSy TGAUHQKHVIG KBCYPTOSSIPK 60 
QSPniII>EDL TQVHVllLKSai KFQGHDKXGL EHTPrHNTGlC TVEIMLTNDy RVaOOVfiEMV 120 
FXASKXT9EW OKCNMSSDGS BHSLiSSQKFP LBMQZYCFDA DRFSSFEEAV HESKGKIiIlAZiS IBO 
t U^BVOra BW LOEPKAIIHGV SSVSRFGRQA AIJ3PFII»LHXi l^mtaTDHSm 1Q9GSX*T3PFC 240 
TDTVCWIVFK DnVSIBESQI* AVPGBVI*1MQ QSG!rV11U4I]Y LQHNFRCQOir XFGRQVFSSy 300 
IGKBEIHEIAV CSSBPEHVOA DPENYTSLIiV TWEHPRWVCT THIEKPAVLY QQMOSEDQTK 360 
HBPLTDQYQD LGAILKKIiLP HHSYVIiQlVA ICnfiaiiYGKy SDQLTVDHPT DHPSLDLPPE 420 
U.&SBeilXS SEEBKDIBBG AIVNPGRPSA TI9QIRKKBFQ XfiTTTHYlIRI GTKYNBAKTH 4B0 
R8PTBGSEFS WSDVCHTSXi IflSTSQFVXKL KIBKDZ&l/rS QTVIEIiFPHS VBGT&ASLND 540 
GSKXVUeSEH MHIiSGTAESL nrVBlTBYBB BSLLTSFKUD TGAEISSaGfiS PATSAIPFIS 600 
ENISQGYIFS SEMPBTrTYD VI/IFEGARlTA SEDSTSSG6B ESUCDPSHBG NVWFSSSTDI 660 
tAQPDVGSGR ESFliQTlUYTB IKVDESEKTT KfiPSAGPUMB QGPSVTDLEM PHYSTPAYPP 720 
TEVTPRAFIF SSBiOODIiVST VNWYSQTTQ PVYWBASNSS HESRXGLABG X1B6BKKAVXP 780 
IrVXVSACTFX CLVWLVQILI YTOtKCFQXAH FSTLEDSTSPR VISTPPTPIP PISDDVGAIP 840 
IKBFPIOIVAD LHAS&GFTEE: FBBVQSCTVD I^XADSSHH PDNKHKSmYI NIVAYDHSKV 90 0 
KLAQLABKDG KLTDVXMANy VDOVmSFKAY lAAQORLKST AEDFHBHXHB HfilVEVXVHlT 960 
ZnjVElQSEtRKC JSQYnPADGSB EnrGIIFI*VTOX SVQVZiAYYTV fiNFTLTtNTKZ KKGSQSGRFS 1020 






PCT/US02/36810 



GRWTOYHYT QWPDWGVPEY SIjPVIjTPVRK AKYAKRHAlVG PVWHCSAGV BKrOTYIVLD lOBD 
SHtiQQIOHBG TVNIFGFLKH IRSQUNYI-VQ TEEQYVFIHD TI.VEAII.SKE TBVLD8BIBA 1140 
YViaaLI.IPGP AQKTKIjEKQF QliSQSNlOO SDYfiAALPOQC NREKNRTSSI IPVKRSRVGI 1200 
SSIiSGEGTDY INASYIHGYY QSNEPIITOH PIJiHTIKDFW BMIWDHNAQL WMIPDGQNM 12 60 
AEDEFVYWEN KDBPIHCBSF KVTLMAEEflK CLSNEEKLH QDFII^EATQD DYVIiBVRHFQ 1320 
CPKWPHPDSP ISKTPELISV IKEEAANRDG PWIVHDEHG6 VTABTPCALT TLMHQ LBKBW 1380 
SVnVYQVAKM IKIIJ4RPOVFA DIEQYQFIiTfK VUjSZjVSTRQ BBNPSTSLDS NGMLPDOHI 1440 
ASSliBSLV 

Seq ID NO: 377 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 501-4514 

1 11 21 31 41 51 

CTkOVCATAOG CAOQCACGAT CTGACTXaaA TCTATACACT GC3AGGATTRA AACAAACAAA 60 
CAAAAAAAAC ATTTOCTTOG CTCCCCX3'OC CTCTCCACTC TGAGAAGCAG AGGAGOCOCA 120 
OGGOG^OGGG OOBCAt3A£!C6 TCTGGAAATQ CGaATCCTAA AGOSTTTCCT 0C3CTTGCATT 180 
CnSCTCCTCT GTQTTTGCCG CCTa<3ATTGG GCTAATG<3AT ACXACAGACA AC&GftG2VAAA 240 
CTTCalTSAAG AGATTGGCTQ GTCCTATACA. GQAGCACIGA ATCAAAAAAT TOGGQAAAQA 300 
AATATCCAAC ATC5TAATAGC CCAAAACAAT CTCCTATC31A TATTGATQAA GATCITACAC 3«0 
AAj&TAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG OGATAAAACA TCATTQGAAA 420 
ACACATTCAT TCATAACACT OQQAAAACAG T<3GaARTTAA TCTCACTAAT GACTACCGTQ 460 
TCAGOQGAdGG AQTTTCAGAA ATGC5TOTTTA AAGCAAGCAA QATTAACTITT CACIGGGGAA 540 
AATOCAATAT GTCATCTGAT GQATCAGASC ATAQTTrAQA AGQACAAAAA TTTCCACTTG 600 
AGATCCAAAT CTACTGCTTT GATGOGQACC GATTTTCAftQ TTTTGAjOGAA aCAGTCAAAG 660 
GAAAAGGGAA GTTAAGAOCT TTATCCATTT TSTXTGAGGT TOGGACAGAA GAAAATTT«S 720 
ATTTCAAAQC OATTAlTGAT eOAOTOGAAA GTGTTASTC6 TTTTGGGAAQ CAGGCTGCTT 780 
TAGATCCATT CATACTGTTG 2\AOCTTCTGC CAAACTCAAC TQACAJW3TAT TACATTTACA 840 
ATC3GCTCATT GACATCTCCT CCCTGCACftG ACAC3U5ITGA CTGGATTOTT TXTAAAGATA 300 
CAGTTAGCAT CTCTQAAAGC CAiGTTGGCTO TTTTTTGT^ AGTTCTOrACA ATOCAACS^T 960 
CTGGTTATOT CAK5CIGATG GACTACTTAC AAAACAATTT TOGAGAGCAA CAGT ACAA GT 1020 
TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGRACSA QAITCaTGAA GCAOTTTGTA 1080 
GTTCA<3AACC AflAAAATGTT CAOGCTGACC CAOAGAATTA TACCAGCCTT CTIGTTACAT 1140 
GGGAAAGACC TCCSAQTCBTT TATGATACCA TGATTGAGAA GTTTGCfUSTT TTQTAQCAfiC 1200 
AGTTOGATGG AGAGGACCAA ACCAAGGATG AATTTTTGAC A6ATGGCTAT CAAGACTTOQ 1^60 
GlGCTATXCT CAATAATTTG CTACCCAATA TGAJ3TTATOT TCTTCRSATA GTAOCCATAT 1320 
GCACTAATGG CTTATATGGA AAATACAGOG ACCAACTGAT TGTOGACATQ CCOACTGATA 1380 
ATCCTGAACT TGATCTTTTC GCTGAATXAA TTGGAACT6A AQAAA'XAATC AAgGAfiflASG 1440 
AAjGAGGGAAA AGACATTGAA GAAOGCGGTA TTOXGAATCC TGG-TAGM3AC AGTGCTACAA 1500 
AOCRSAXCRG GAAAAAOGAA CCCCAGATTT CXACCACAAC ACACTACAAT OSCATAOGGA 1560 
GSAAATACAA TGAA6GCAAG ACTAACCGAT COCXMCRAG AQGAAGTQAA TTCTCTGGAA 1620 
AlGGGTGATGT TOOCAATACA TCTTTAAATT CCACTTCCC3V ACXZROXCACT AAATTAOCCA 1680 
CAGAAAAAGA TATTTCCTTG ACTTCTCASA CTGTGACTOA AClGCXaCCT GRCACIGTGG 1740 
AAGQTACnC AGCCTCTTTA AATGATOGCT CnU^AACTGT TCTTAI3AXCT CCACATATQA IBDO 
ACTTGTOGGQ QACTQCaGAA TOCTTAAATA CAGTTTCTAT AACAGAATAT GAOGAGGAGA 1860 
GTT^ATOGAC CAGTTTCAAQ CTTGATACTG <3RSCTGA3«3A TTCTTdAQGC TCJCaOTOOOe 192 D 
CAACXTCTOC TATCCCATTC ATCTCTC5AGA ACATATC5Ct3^ AGGGTATATA TTTTCCT008 1980 
AAAACCCAGA GACAATAACA TATGATGTCC TZATAGCAGA ATCTQCTAGA AAIQCTTC2C36 2040 
AA£3ATTC3U\C TttMCftOGT TC3M3AAGAAT CACTAAAGGA TOCTTCIATQ OaGGGAAATO 2100 
TGTGGTTTCC TAGCTCTACA GAGATAACM CACAGCCOGA TGTTGGATCA OGCA0AGAGA 2160 
OCTTTCTCCa G&CTAATTAC ACTGAGATAC GTOTTE3ATGA ATCTGAQAA3 ACAACCAAGT 2220 
CCTTTTCTGC AGGCCCAGlG ATGTCACRGG GTOOCTCAGT TACRfiATCTG GAAATGCCAC 2280 
ATTATTCTAC CTTTCOCTAC TTCGCAACTG AQOTAACACC TCATQCrTTT AOCCGATOC? 2340 
OCASACAACA GaATTTGGTC TCCAiCGGTCA A0SIGGTA1A CTCGCAGaCA ACGEAAOCGS 2400 
TATACAAT13A GGCCAGTAAT AGtAOCCATG AGTCTCQTAT TOSTCTA0CT GAGGGGTTGG 2460 
AArPCOSAGAA GAAGGCAGTT ATACCCCTTG TGATOGTGTC AdCCCTGACT TTTATClGTC 2520 
TAOTGOTTCr TX3TGGGTATT CTCRTCI7«7r QGRSGAAATG CTTC3CAQACT GCACACTTTT 2580 
ACTTASAGGA OtfJEACATCC OCTAOAfiTTA TATCCACACC TCCAACACC7 ATCTXTGCRA 2640 
TTTCAGAT6A TCTOGOnQCA ATTCX3WTAA AGC!ACTTTGC AAAGCAXGIT GCAGATTT&C 2700 
ATGCAAGTA6 TGGCTTTACT ISAAGAATTTG A0ACACTGAA AGAST7TTAC GAGGAAGTGC 2760 
AGAGCTGTAC TOTTCJACtTA GGTATTACftG CAGACAGCTC CAACCAOCCA GACAACAAGC 2BS0 
AiCAAQAATOG ATACATAAAT AUCGTTGOCT ATQATCATAG CAGGGTTAAG CTAGCACAlQC 2880 
TTCCTGAAAA OGAIGGCAAA CTOACTGAIT ATATCAATGC CAATTATSTX GArGGCTACA 2940 
ACAGAOCAAA AGCTTATATT GCXGCCCMa QCOCACXGAA ATOCACnSCr GABCTTTTCT 3OO0 
GGAGAATQAT ATGGGAACAT AATGTaOAAS TTATTOTCAT GATAACAAAC CTCGIOSAGA 3060 
AAGGA7U3GAS AAAATGTGAT CAGTACIGGC CTGCOGATGG GAGTGAOGAG TA OGOO AACI 3120 
TTCTG3TCAC TCCAGRAG2MJT GTGCAAGT6C TTGOCTATTA TAC3?GTGAGO AATTTTAGTC 3180 
rAAOAAACAC AAAAATAAAA AAGOGGICOC AGAAAGGAAG AGCCAGTOQA GGTGTGGTCA 3240 
CACA6TASCA CTACAOQCSfi VBGCCTOACA TGGGAGTAOC AOAQIiAlCXCC CIOGCnSlGC 3300 
rrGACCTTTGT GAGAAAGGCA GCCTA1?GGCA AGCGCSCATGC AGTGGGGOCT 6TT6TOGTGC 3360 
ACTGCAGTGC TOQAOTTOQA ASAACAGGCA CA1ATATTGT GCTAGRCaGT ATGTTGCAflC 3420 
AGATTCAACA CGAAGGAACT GTCSKACATAT TTQGCTTCTT AAAACACATC CGTTCACAAA 34 SO 
GAAATTATTT GOTACAAACT 6AGGAGCAAT ATGTCTTCAT TCATGAlTACA CTGQTXGRGG 3540 
OOVTACZZAfi TAAAGAAACT GAGGTGCXGG ACnSlCtVXAT TCATGOCTAT GTXAATGCAC 3600 
TCCXCATTOC TGGtACCAOCA GGCAAAACAA AGCTAOAOAA 2UCAAITCCA6 CTGCTGAGCC 3660 
AGfTCAAATAT ACRGCAGAGT QACrAnXTPG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 
ATGGAACTTC TTCTATCATC CCTGTGGAAA GATCAAOQC5T TGGCATTTCA TCCCTGAGTG 3780 
GAGAAGGO^ AGACTACATC AATGOCTGCT ATATCATGGS CTATTACCAG AGCAATGAAT 3840 
TCATCATXAC CCAGCACCCT CTCCCTCATA CO^TCAAGGA ITTCTOQAGG ATGAlSKTCGG 3900 
ACCATAATGC GCAACTGGTG GTTKXGKETC CTGATOGCCA AAACATGGCA GAAGATGAAT 3960 
TIOXTTACTQ GOCAAAXAAA GATGAOCCTA TAAATTGTGA GAOCTTTAAa GTC ACTC TCA 4020 
TOQCTCAAGA ACACAAATGT CTATCTAATQ AGGAAAAACT 7ATAATTCAG GACTTTATCT 40 BO 
TAGAAGCTAC ACAGOATGAT TATOTACTTG AAGTGAOGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCaCA TACCCOCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAI3 4200 

CIGCCAATAG OBATQGGCCT ATOATTOTTC ATOATGO^GCA TGOAGGAOTG ACQGCaG GftA 42fi0 

CrTTTCmSTGC ^TCTGACAACC CXTATGCaCC AACTAGAAAA AGAAAATTCC srGGATOTTT 4320 

ACCAGGTAGC CAAOATGATC AATCTGATGA GSCCAGGAGT CTTTGCTQAC ATTOASCArST 4380 

ATCft3TTTCT CXACAAAGTO ATCCTCACSCC TTGTGAGCAC AAGGCRCGAA QRGAATCCAT 4440 

CCACCTCTCT GQACACSTAAT GGTQGAGCAT TGCCTGATGG AAATATAGCT GAGAQCTTAG 4500 

AGTCTTTAGT TTAACACRGA AAflSGGTGGO OGCaCTCW^ TCTGAGCATT GTTTTCCTCT 4S60 

TCCTAAAATT flGGCRGOAAA ATCTVSTCrAG TTCTGTTATC TGTTt5ATTTC CCATCACCTG 462Q 

ACAGTAACTT TC3V1GACATA GOATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 46 SO 

TTTTTGCaAG ACTTGTAATT TACTTATTAT GTTTQAACTA AAATGATTGA ATTXXACAC3T 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCT6TATTGA TTTTAACAGA AAAXTTCAAT 4800 

TTATAGAGCSX TAGGAATTCC AAACTACaGA AAATOrmtSI TTTTAGTGXC AAATTTTTAG 4S60 

CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTQTA 4920 

AATAAAACAC TCTTCXaiTAT GATATTCAAC ATTTTACAAC TGCASTATTC ACCTAAAGTA 4980 

GAAATAATCT GTTACTTATT GTAAATACTQ CCCTAGTGTC TCCATQGACC AAATTTATAT 5040 

TTATAAITGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTCTAATTG 510 0 

TTTAGTTTAA TGAC3GTAGTT CAlTAGCTOQ TCTlACTCTA CCAGTTJTTCT GACATTQTAT S160 

TGTOTXACCT RAGTCaTXAA ClTSOmCPi GCATQTEAATT TTAACXTTTG TGGAAAATAG 5220 

AAATACCTTC ATTTTGAAAG AAQTtCTTTAT QAfiAATAACA CCTTACCRAA CATTGTTCAA 5280 

ATOGTTTTTA TCCAACSOAAT TGCAAAAAIA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 



Seq ID NO: 378 Protein sequence 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

MVnCASKITP HWGKCHMSSD G&GSSLEGQK FPLBMQIYCP DADRFSSPHB AVKGKGKUJA 60 

IiSIIjFEVGTB ENLDPKA11S> GVESVSRFGK QAALDPflLIr UliriPNSTDKY YIYNGSIjTSP 12 O 

PCIDTVDWIV PKDTV8ISBS QLAVFCBVIiT MQQSGYVHU9 DYLQNNFSEQ QYKF8RQVFS 180 

SV7GKEBIHE AVCSSEPBilV QADPEHYTSL LVTHBRPKW miMIE KFAV IiYQQLDGEDQ 240 

TKHBEXiTDOY ODLGAILBOSI* I^MMSYVLOI VAlCTHGIira mfSDQLJVDM V1ZINFBU>IjP 300 

PEaWGTBBlI KESBEeKDZE SGAIVHPGBD SMBQIRXKB FQISTTimH ItlGTKSNEAK 3 60 

TNRSPTR6SE FSGBKaWPNT S12ISTSQFVT lOUATEKDISD TSQTVTEIiPP HTVEGTSASIi 420 

KDGSKTVLRS PHMNLSGtAE SWITVSITEY EBESLLTSPK tDTGAEDSSG SSPATBAIPP 4B0 

rSEEnSQGnrX FSSEHPETIT ^BVLIPESAR MACBDSTSSG SEBSIiKDPaH BGNVHPPSST 540 

D1TAQFDVG8 GRESFWCBY TBXItVDfiGEK TTKSFSAGPV MSQ&PfiVIDL EHPEYfiTFAY €00 

PPTEVTPHA7 TPSSEiQQfElIiV filVMWYSQT TQPVyHEAaS &GBE8RXQIA SGLE5EKKAV 660 

IPtVlVfiALT FICI.WLVGI LIYWHKCFQT AHFYUBDSTB PKVISTPPTP IPPISDDVGA 72 O 

IPIKHPPKHV ADUlRSSGPr EEFETIfKEPY QEVQSCTVDL GITADSSKHP iJNKHKlffRYIM 780 

XVATDHSKVK LAQIAEKDGK LTDYIMANYV DGOTHPKAYI AAQaPLKSTA BDFHRMIWEH B40 

NVBVIVMITN XiVSnSRRSQCD QyWPADGSEB TGHFLVTQKa VCfVLAYYTVR KFTLRHTKIK 900 

KGSQW5R»8G WmQWrXQ WPIfflGVPEXS I.PVI.TFVRKA ATTAKllHAVGP VWHCSAGVG SCO 

RTGTYIVXiDS HLQQIQHEGT WIFGFLKHI RSORMYLVOT BEQYVFITOT IiVBAILSKBT 1020 

BVLD3HIHAY VKAItlilPGPA GKTKIiEKQFQ liLSQgWIQQS DYSAALKQCN REKHRTSSII 1080 

PVER8RVGXS SLSGEGTDYI WAS^IMGYYQ SiaEFlITQBP LUmSaJEWE HIHDHNftQWT 1140 

VHXPDGQNMA BDEFVYHBEIK DEPIflCBSPK VTKMAEEHKC LS NgBKI allQ PF1I>BATQDD 1200 

YVI^VRHFQC PHHSNPDSPI SKlFEblSVI KSBAfflSIRDGP KCnfflDBHCGV TftGTFCAIOT 1260 

IMHC^^KENS VranrOWUSMI HUCtEQVPAD IECTOFI.YKy IIW/BTRQB EHPSTSU^SN 1320 
GAJOiPDORlA BSLESXiV 



Seq ID NO: 379 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4632 

1 11 21 31 41 51 

| | | | | | 

CACACATAOG CACGCACGAT CTCACITC33A TCTATACACT GGAGGATT7UV AACAAACZAAA 60 

CAAAAAAAAC ATTTOCTTCa CTOOOCCTCC CTCrfCCACTC TQAGAAGCAG AGGAGCOGCA 120 

CGOOSAGGGG OCGCAGACOG TCTGOAAATG OGAATCCTAA AACS3TTTOCT OGCTTGCATT 180 

CAGCTCCrCT GTGTTTOOaa CCTGGATIGS GCTAATGGAT ACTACAGACA ACAGA(»AaA 240 

CTTGTTGRftG AGATTGGCTG QTC3CTATiWa OOAOCACTGA ATCAAAAAAA TTQGGGAAAG 3O0 

AAATKTOC3A CATOTAAECAG OXSWVAACaA TCTOOTCTCA ATATTGATGA AGATCPTACA 360 

tSRAGTAAATO rEQAATCTXAA GAAACTTAAA TTTCAQGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGQGAAAACA GTGGAAATTA ATCTCACtAA TGACOaCCGT 4B0 

GTCnSOGGAG QAGUrrCAGA AATGGTGTTT AAAfiCAAGCA AGATAACTTT TCACTGGGGA 540 

AAVEQCaaa^A TOTGATCXGA TOGATCHSAG CKEACTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGA1GCAAA TCTACTGCTT TGKiGCGGHC OBATITTCAA QTTTTGAGGA AGCAGTCAAA 660 

QGRAAAGGGA AfflTTAAGAflC TTTATOCATT TlGTTTGAOG TTGGGACAiaA AGAAAATTTO 720 

GATTTCAAAG CGATTATTGA TGOAGTCGAA AGTGTTAGTC QTTTTGGGAA GCRGGCTGCT 7B0 

TTAGATOOVr TCMACTSTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC B40 

AATQGCTCAT TGACATCTOC TOCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

AGAGTTAGCA TCTCTGAAAiS CCAGTTGaCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTAXG TCATGCTGAT QOaCTACTTA CAAAACAATT TTCGftfiAGCA ACA GXACAAG 1020 

TTCTCTAGAC AQGTGTTTTC CTGRTACRCT GGAAAGGAAG AGATTCATGA AGCAGrTTGT 1080 

AGTTCWSAAC CAGAAAATGT TCAG6CWSAC CX3M3ACAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAA6AC CTCGAGICGX TTATOATACC ATGATTGAGA AGTTTaCAGX TTTGTAOCaG 1200 

CAGTXGGATG GASAGGACCA AACCAAGCAT GAATTXTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCRATAATTT GCTAOCXAAT ATGAOTTATG TTCTTCfiGAT AOTAQCCATA 1320 

TGCavCTAATG QCTTATATGG AAAATACAGC GACCAACTQA TTGT0GAC3VX GCXXACXGAT 1380 

AATGCTGAAC XTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAOBAG 1440 

GAAGAGGGAA AftGACATTGA AGAAQGOGCT ATTGTGAAITC CXGOPASAGA CACTGCTACA 15O0 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACXa^XA CACACTACAA TGGCATAGGG 1560 
ACGAMVTACA ATGAABCCAA. GJCTAACCGR. TCCCCfMCPA QAGGAflGTGA AITCrCTQGA 1620 
AAGGGXeATG TTtKCRATAC ATCTTTAAAT TOCACTTCCC AACCftGTCAC TAAATTAGCX: 16B0 
ACAGftAAAAG ATATrrCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTSTG 1740 
GAAGSTACTT CAGCCTCTTT AAATGATOGC TCTAAAACTQ TTCTTAGATC TCC&CATATG 1800 
AACTTGTCGG GC3ACTGCAQA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TOAGGRGGAG 1860 
AGTTIATTGA CJCaGTTTCJKA QCTTeATACT OSaGCrGAAQ ATTCTTCAGQ CTCCWSTCOC 1920 
GCAACTTCTG CTATCCC&TT CATCTCTGAG AACRTATCCC AAGGGTATAT ATTTTCCTCC 1980 
GAAAACCCAG AGACAATAAC ATATOATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 
OAAGATTCAA CTTCATCMG TTCAGAAOAA TCACTAAAGG ATCCTTCTAT GGftSOGAAAT ZlOO 
GTGTGGTTTC CTAGCTCTftC AG&CATAACA GCACSVQCCOG ATGTTGQATC AGGCAGAOAG 2160 
AGCTTTCTCC AGACTAATTA CACTQABArrA OGTGTTQA'CG RATCTGAGAA GACAACCAAG 2220 
TCCTTTTCTG CAGGCCCAGT GATGTCACAiG GOTCCCTCAG TTACftGATCT GC^AATGCCA 2280 
CATTATTCTA CXTTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 
TCCAGACAAC AGGATTTGGT CTCCACSGTC AACGTGGTAT ACTOGCAQAC AACCCAAOOQ 2400 
OTATACAATG AGGCCAQTAA TAGTAGCCAT GA<5rCTC3GTA TTOGTCTAGC TOAQGOffTTC 2460 
GAATCCGA3A AGAAGGCAGT TATACCGCTT 6TGAT0GTQT CAGCCCTGAC TTTTATCTOT 2520 
CTAGTOGTTC TTOTOOGIAT TCTCATCTAC TGCSAOSAAAT GCTTCCAGRC TGCAjCACTTT 2580 
TACTTAGAGG ACAGTACATC CCXTTAGAOTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 
ATTTCRGATG ATOTOGGAGC AATTCCAATA AAGCACTTTC CAAASCATGT TGCAGATTTA 2700 
CATGC3UW3TA OW3GGTTTAC TQAftGAATTT OAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 
CAGAGCTGTA CrGTTOftCTT AG6TATTACA GCAeACftGCI CCAAOCACX:C AGAC&ACAAG 282 D 
CaCAAGAATC GATACATAAA TAIttTTQCC TATGATCATA GCAGGGTIRA GCTAGCACAG 2880 
CTTOCTGAAA AGGATGGCAA ACTQACTQAT TATATC&ATG CCAATTATGT TCaATQGCTAC 2940 
AACAGACCiAA AftSCTTATAT TGCTGCCCSiA GGOCCACTGA AATCCAC3«3C TGAAGATTTC 3000 
TGGASAATGA TATGGGAACA TAATGTGGAA GTTATT6TCA TQATAACAAA CCTCGTGGAG 3060 
AAAGGAAQGA GAAAAI^TGA TCAGTACIOG CCTGOCGAlQ GCnSTOAGGA GTACQOGAAC 3120 
TTTCTGGTCA CICAGAAGAlG TGreCAAGTO CTTGCCTATT ATACTCTGAG GAATTTTACT 3180 
CTAA(3ftAACA CAAAAATAAA AAAGOGCTCC CAOAAAGGAA QACCX2AGTGG AOOTOrGGTC 3240 
ACACAGTATC ACTACACGCA GTGGOCTGAC ATGGGAGTAC CAGACnCACTC CCTGCCAGTG 3300 
CTGACCTTTG TGAOAAAGQC AGCCTATGCC AAQOGOCATG CAQTOGGGCC TGTTGTCGTC 3360 
CACTQCAGTS CTGGAGTTaa AAGAACAGGC ACATATATTQ TQCTAGACSUS TATGTrOCIUS 3420 
CAGATTCSAC AOQAAGGAAC TOTCAACATA TTTGGCTTCT TAAAACRCAT COGTTCACRA 3480 
AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3S40 
GCCATACTTA GTAAAGAAAC TGAGQTGCTG GACA13TCATA TTCATGCCTA TOTTAATGCA 360O 
CTCCTCATTC CTGGACCAGC AGGCAAAACA AAfiCTAflAGA AACAATTCaa. OGGTCTCACT 3660 
CTGTCnCCGA GGCTOOftSTG CAQAifleCACA RTCTCO0CTC ACTGCAAOCT TCCTCTCCCT 3720 
GGCTTAACT0 ATCCTCCTAC CTC3«5CCPaC GGftGTGGCTQ GGACTATACrT CCIGAGCCAG 3780 
TCAAATATAC AGCAGASTGA CTATTCTGCA GCCCTAAAGC AATGC3UiCRG GGAAAAGAAT 3840 
CGAACTTCTT CTATCATCXX TGIGGAAAGA TCAAGQGTXG iSCATTTCATC CCTGAGTOQA 3900 
GAA0G<3kCAG ACTACATCAA TGCCTCCXAT ATCAOtSGGCT ATTftCCMSAG CAATGAATTC 3960 
ATCATTACCC AGCROCCTCT CCXaXATACC ATCAaSGBOT TCTGGAGQAT GATATGGQAC 4020 
CATAKIGCCC lUyCTGfflXSGT TATQATTCCT GATGGGCAAA AC3.TGGCAGA AGATGAATTT 4080 
erTTACTGGC CAAATAAAGA TSAGCCTATA AATTGTOAGA GCTTTAAGQT Ca^CTCTTATQ 4140 
GCTOAASAAC AC3UVAT6TCT ATCTAATGAG GAAAAACTTA TAATTC3M3GA CTTTATCTTA 4200 
QAAGCTACAC AGGATGATTA TGTACTTGAA GIGAGGCACT TTCAGTGTOC TAAA1GGOCA 4260 
AATOOtOATA GGGCCATTAG XAAAACTTTT GAACTTATAA GTGTTATAAA AGKAGAAQCT 4320 
GCCAATAGGG ATGSGOCTAT GATTGTTCAT GATGAGCATG GauCSGAGTGAC GGCAGGAACT 4350 
TTCTGTGCTC TGACAACCCT TATGCACC!aA CTAQAAAAAG AAAATTC3Qt3T QGATOTTTAC 4440 
CAGGTAGCCA AOATGATCAA TCIGATGAGG CCRQGAGTCT TTGCTGACAT TGAGCRGTAT 4500 
CAGTTCCXCT ACAAAGTOAT CCTCAaCCTT GTGOGCACAA GGCAGOAAGA GA ATOa aOC 4560 
AOCICTCTQG ACAGTAATGG TGCAGCKTTQ CCXGATGOAA ATIOKSCrGA SAGCT TOgAG 4620 
TCTTTAIGTTT AACACAGAAA GGGGT0GG06 GACTCACATC TQAS3C3VIT6T TTTCCSCTTC 4680 
CTAAAATTAO GCAGGAAAAT CAGTCTAGTT CTGTTATCTO TTGATTTCCC ATCACCTOAC 4740 
AGraiflCTTrC ATGA«2ATAGG AITCTGCCGG CAAATTTATA TCATTAACAA TGTGTGCCTT 4B0O 
TTTOCRAGAC TTGTAATTTA CTTATXATaT TTGAACTAAA ATGATTSAAX TTTAC AGTAy 4860 
TTCIAAGAAT GQAATIGTOG TATTITTITC TGKCAITGKTT TTAAGAeAAA ATTirCft&TTT 4920 
ATAGAGGTXA GGAATTOCAA ACIACAOAAA ATGTTTGTTT TTAGTOTCAA AT TTTTA flCT 4980 
OTKHTTGTAG CAATTTATCAG GTTTGCTAGA AATAXAACTT TTAATACAGT A£3C!CTGTAAA 5040 
TAAAACACTC TTOCATATGA TATTCMCAT TTTACAACTG CAGTATTCAC CTAAAQTAGA SI 00 
AATAATCTGT TACTTATTOT AAAaCACTGCC CTAGTGTCrC CATGGACCAA AarTTATATTT 5160 
ATAATTQIA6 ATTTTTATAT TTTAiCTACTG A0TC3kAGTTT T CPMar gCIG TGTAATTGTT 5220 
TAGTTTAATa AGGTAOTTCA TTAsSCTGGTC TTACTCTACC AGTTTTCTT3A CATTGTATTG 5280 
TGTTACCTAA GTCAITAACT TTGTTTCAGC ATOTAATTTT AACTTTTGTG GAAAgTAGRA 5340 
ATAOCTTCAT TTTGAAAGAA GTTTTTATGA GAATAAOiCC TTACCAAACav TTGTTCAAAT 5400 
QGirtTTATC CAAGGAATT6 CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 380 Protein sequence 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

| | | | | | 

MRIIiTCRtlAC IQLliCVCftLD WAHGrSfHQQR KIjVEBIGWSY TOALHQKNNG KKSTPTCNePK «0 

QSPIHIDKDi TCfVKVNliKXL KPQGWDKTSL EHTPIHNTQK TVEINMMDY RVSGGVSEMV 120 

FKASKITFHW QKC33K6SDG6 EBSI^BGQKFP LEHQIYCFDA DRFSSFBBAV KGKBKIiRALS 180 

XUnSVGTEEEf UDFECAXIDGV ESVSRFGRQA IOSHSVIImUHIm IiPHSXHKXYX YNGSIiTSPPC 240 

TDTVDWIVPK DTVSXSBSQL AVFCSVItOMQ QSGYVIflWY IdONMfBEQQT XFSRQVFfiSX 300 

TXSKBEIHEAV CGSEPENVQA DFESTYTSIiliV TtfERFftWyD TMIBKPAVi»Y QOLDGEDOTK 360 

HEPIiTDGYQD LOAIOOHJ:,? NMSYVLftlVA ICTWGLYGKX SDQLIVDWPr DNTPELDLFPE 420 

LIGT^IXKE BEBGKDIBBG ATOTPGRDSA TMQIHKKEPQ ISTTTHMIRI GTKXWKAKIN 480 

SSFI3U3SEFS GXGDVFMTSL HSTSQPVTKI> ATSKDZSLTS QTVTEIiPSaT VEGXSASLVnS 540 

GSKTVURSSa MllI>a<3T»E&£i NTVBXTEYEB BBIiIiTfiFKIpD TGAEDSSGSS PAT8A3PFXS 600 

BHX8QGYIPS SEHPBTZTm VXiIFSSSASHA SEDSTSSGSB BSUCDPSHBG imrFFSSTOZ 660 

HtQVaVaaSBL EaFLQTirrPB IRVDBSEKTT KSPSAGPVMS QGPSVTDLEH PSYSTFATFP 720 

TEVTPHAFTP fiSHQODLVST VNWYSQTXQ FVYHEASMSS HESRIQIABG DBSBKKAVZP 
I»VIVSALTFI CDWliVGIliI VWHKCFQTAH PYLEDSTSPR VISTPPTPIP Plg lPWS&IP 
IKHPPKHVW> LHASSGFTEE FETLKEPYQE VQSCTVDLGI TADSBSHFEOit OTKHRYINrV 
ATOHSHVKIA QIAEKDGKLT DYrMANYVDG YKRPKRYIAA QGPtiKSTAED FHFMlVnBHNV 
EVIVMITNIrV EKGRRKCDQY WPADGSKEYG NFLVTQKSVQ VIAVYTVRHF TI.RNXKIKKO 
SQKQRPSORV VTQiHYTQHP BHOTPEYSIaP VUTFVRKAAY AKREIKVGPW VHCSftS tfGRT 
GTTIVIDSML QQIQFEGTVS IFCFLKHIHS QRWYI.VffrEE QYVFimrniV EZaLSKBTBV 
U)SHJBAYVK AIiTiIPGPAQK TiOiEROFOGI. TLSPRDBCRB TlSAHCNIiPL PtSLTDPPTSA 
SRVAGTl1iZ»S QSHIQQSDye AALKQCNREK HRTSSIIPVE RSRVGISSI^ 6BGTDYINAS 
YIMGYYQSNE FIIT<3HPi:.I.H TIKDFWRMIM DHNAQIiWWI PDOQNMJiEDB FVYHENKDEP 
IMCESPKVTI. MAEEHKCI^N EEiCLIJCKJPl LEAT5»1DYVL EVBHFQCPKH EUFDSPISKT 
FBLXSVJKEE AAMKDGPMJV BDBUOGVTAG TFCM/TTLMH QSLEKEtfSVIW TQVAKMINIiH 
RPGVKADIEQ YQFLYKVZL5 IiVGTSQEESIP STSCDStSttSAA LPD6HIASS£* SSLV 

Seq ID NO: 381 DNA sequence 
Nucleic Acid Accession #: NM_002851.1 
Coding sequence: 148-7092 

1 11 21 31 41 51 

| | | | | | 

C3M»CATACG CACGCAOGAT CTCACTTCGA. TCTATACftCT GQAGGATTAA AACAAACAAA 
CAAAAAAAAC ATTTCXJITCG CTCCCCCTCC CTCTCCACTC TGAGAAQCRG AGSAGCOSCA 120 
OC3GCt3At3QeG OCGCAGACC3G TCIXSGAAATQ CXSft&TCCTAA AGCGTTTCCT COCTTGCATT IBO 
CRI3CTCCTCX GTOTTTGOOG CCTK3GATTGG GCIWilQGAT ACTACAGACA ACftQAOftAAA 240 
CrTGTTOAAG AGATTCJQCTG GTCCTATACA GGAGCACXOA ATCAAAAAAA T!£GGSGAAAO 300 
AAATATCCAA. CRTGTAATAG OGCAAAACAA TCTCCTATCA ATATXGATGA AOKTCrTACA 360 
CAACTAAATO TGAATCTTAA GAAACTTARA TTTCAGOQTT GGGATAAAftC ATCATTGOAA 420 
AACACATTCA TTCATAACAC TGC3GAAAACA OTGGAAAITA ATCTCACTAA TGACTAOC3GX 4B0 
QTCAGCGGAG GAOTTTCaVSA AATOaTGTTT AAAQCAAGCA AdGATAACTTX TC ACTOG OaA 540 
AAATQCAATA 'TGTCATCTGA TGGATCIVQAO CATAGTXTA6 AAGGACAAAA AST'i'CCACTT fiOO 
QAfiATGCAAA TCTACTGCTT T0ATGC3GGAC OE3ATTTTCAA GTTTT(5l«3GA AGCSUSTOVAA 660 
GGAAAAGGGA AG^TTAAGAGC TTTATCCATT TTeTTTGAGG TTGGGACAGA ftSAAAATTTG 720 
6ATTTCAAAG CGATTATTGA TGGASTOtSAA AGTGTTAGirC GTTTTQQGAA GCA6GCTGCT 780 
TTAGATCCAT TCATACTTSTT GAACCTTCTG CCAAACTCRA CTGACA AGTA TTACATTTAC 840 
AATGGCTCAT TGRCATCTOC TCCCTC3CACA GAGACAGTTG ACTGOATTGT TTTTAAAGAT 900 
ACAGTTAQCA TCTCTGAAAfl CCA6TTGGCI GTCTTTTGTO AACSTTCTTAC AATTGCaACAA 960 
TCTGGTTATG TCATGCSGAT GGACTACTTA CAftAACMTT TTGORiaAGCA ACM3TAC3WG 1020 
TTCrCTAOAC AGGT6TTTTC CTCATACACT CSGAAAGGAAB AflftTTCATGA ASCafSTTT ST 1080 
AOTTCAGAAC CAQAAAATGT TCAOOCIGAC OCAOAGAATT ATACCASQCT TCTTGTTACA 1140 
TGGGAAAOAC CTCGAGTOCTT arTATGATACC AgGAJTGftflA AGTTTGCASr TTTSTAOCAG 1200 
CAQTTGGATG GAOftGGlUrCA AAOCAAGC&T GAATTTTTGA CASATGQCIA. TCAAGACTTa 1260 
OSTGCTATTd TCSAATAATTT GCTAOCCAAT ATGAGTTATQ TTCTTCRGAT ACTcnGGCATA 1320 
TGC3LCTAATG QCTTATAIGG AAAATACS^ GAOCAA£9GA TTGTCQACAT GCCTACTOAT 1380 
AATOCTGAflC TIQRTCTTTT OCC3X3AATTA ATTGGAACTG AAlQAAATAAT CAAQQAiGGMS 1440 
GAAiGOUSOGAA AAiGACATTGA AGAAGOOQCI ATIGT6AATC CTSGTSUSAOA CMSTGCTACA ISOO 
AACX3VAATCS. OGAAAAAGGA ACCXiCAlGATT tTCTACCRCAA CAiCatfTCACAA TCGQUCAeSe 1560 
ACBAAA3!ACA ATGAAGCCSA GACTAAHzOGA TGOOCAAfi&A eAGCSUVOTOA ATTCTCTQaA 1620 
AAGGGTOATG TTOOCAATAC ATCTTTAAAT TCCACTTCOC AACiCRGTCAC TAAATTAGOC 1680 
ACS^SakAAAAG ATATTTCCTT GACITCTCAlS ACTSTOACT6 AACTOCCACC TCRCftCTOTa 1740 
GAAGOTACTT CaGCXTTCTTT AAAIGKIGaC TCCAAAACTQ TTCXTAGATC TCSCRCATATG XBOO 
AACTTGTOGG GOAOTGCMA ATCXITTAAAT ACACyPTTCIA TAACAOAATA rE6AX3GAQQA£» 1860 
AGTTTATIGA CCAGTTTCAA GCTTGATACT GGAGCTGAAQ ATTCTTCAQG CTtXaSTCOC 1S20 
GCAACTTCTQ CTATCCCATT CATCTCTGAG AACA^TOCC AAdGGTATAT ATTTXCCTCC 1980 
0AAAACCCAG AORCAATAAC ATA1X3AXGTC CTTASCACCAG AATCIGCTAg ,AA ATCCT TCC 2040 
GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGO AIGCITCTAT QOAOQCAAAT 2100 
GTCJTflGTTTC CTAGCTCTAC AGACS^TAACA GCACRCCOCXS ATGTTGQATC AQGCASAGAG 2160 
AGCTTTCTCC AC5ACTAATTA CACTQAGATA CGTGTTGATa AATCTCaCAA GACAACXSlAG 2220 
TCCTTTTCTG CAGGCCC3U5T GATGTCACWS G6TCCCTCAG TTACAGATCT GGAAATGCCA 2280 
60 CATTATTCTA CCTTTGOCTA CTTCOCAACT 43AG6TAACAC CTCaiGCTTT TADOCCATCC 2340 
TCCA(SVCAAC AOOATTTGGT CTOCAC66TC AACSTGOCAT ACTC9C!aSAC AAOCCAACCO 2400 
GTATACaUVKS 6TGAG3^GACC TCTTCAAOCT XCCXAdV^TA eiGAAGTCTT iGCfCTAGTC 2460 
ACCCCTTTGT 7GCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAOTAflPTOAT 2520 
TCJBOCCTTGC ATGCTACOOC TGTATTTOCC AffPOTOGATG T6TCATTTGA ATCCATCCTG 25B0 
65 TCXTCCTATa AIGGTGCAOC TTTGCTTCCA TTTXOCTCTG CTTOCfTCAG TAOTGAATTG 2640 
TTTCSCCATC TGCAXACAOt TTCTCAAATC CITCCAOUUS TTftCXTCaQC XAXX3GA6AGT 2700 
GA"EAA(50TQG CCTEGCATGC TTCTCTOOCA GTGGCXG60S OTGAtTTGCT ATXAOAGCCC 2760 
AGCCTTGCTC AGTATTCIGA TGrc3CTGTCC ACTACTCATG CTGCTTCAflA GACGCTGOAA 2820 
TTTGC3TA6TG AATCTGGTGT TCTTTATAAA AGGCTTATGT TTTCTCAAGT TgAACCAOCX: 2880 
70 AGCAGTGATO CGATGATOCA TXSCAOGTTCT TCAGGG0CT3 AACCTTCTTA T GCCJ IGTCT 2940 
GMChhTOmaG QCTOOCAACA CATCTTCACT OTTTCITACA GTVCTQC31AT ACCTSTQGKE 3000 
GATTCTGTGO GIGTAACTTA TCAj3l3C3TWX: TTATTTAGOG GCOCTAGOCA TATACCAATA 3060 
CCrAAQTCTT CGTTAATAAC CCCAACTGCA TCATTAfTEGC AGCCTACTCA TGOOCTCTCT 3120 
GGTGATGOGG AATGGTCT(3G AGCCTCTTCT GATAGTGAAT TTCTTTTAOC TGRCACAGAT 3 IBO 
75 GGGCTGACA6 CCCTTAACAT TTCTTCAOCT GTXTCTGXAG CTQAATTXAC ATATACAACSk 3240 
TCI6TGTTTQ 6T6AT6ATAA TAAGOOaCTT TdAAAAOTO AAAXA»I»TA TGGAAATGAG 3300 
ACTGBUVCTGC AAATTCCTTC TTTCAATGAG ATGOTTiACC CTTCTBAAAG CACA GTCMQ 3360 
CCXaWZATQT ATGATAATOT AAATAAGTTS ARTGCGTdT TACAAGAAAC CTCT GtTTOC 3420 
ATTTCTAGCA CCAAGGGCAT GTTTOCAGGG TCOCTTGCTC ATA£DCA)CCRC TAAGOrmT 3480 
80 GATCATGAI3A TTAlBTCAAQX TCCAOAAAAT AACTTTTCA6 TTCAAOCTAC ACATACTGTC 3540 
TCXCAAGCAT dOGTGACAC TTCGCTTAAA CCIGTOCTTA GTGCRAACTC A OAGQC ftCCA 3600 
TGCTCTOAiCC CtGCTICXAS TGAAATQTIA TCTCCTTCAA CICMSCTCIT ATITSAT^O 3660 
ACCTCAGC^ CT T T TA gTAC TOAAGTATTO C1AC3UU3CTT CCTTTC3USGC TICTGKSGfSV 3720 
GACACCTTGC TXAAAACTGT TCTTCCAGCrr GaCGOOCftSTG ATGCAATAOT GGTTGMUkCC 3780 
OCXaUkAJSTTC AIWAATTAS TTCTJM3AT6 TTGCATCTCA TTOTATCAAA TTCTGCTTCA 3840 
aGTGWUUVCA TGCTGCSiCTC TACATCTGTA tXavGTTTTTG ATGTGTC60C TACTTCTCAT 3900 
ATGCACrCTG CTTCftCTTCR AQGTTIXSACC ATTTCCTATG CRAGTGftOTA ATATGAACCA 3960 
GTTTT(3TTAA AftAGTGAARQ TTCCX3iCCnA GTGGTACCTT CTTTGTACAG TAATGATQAG 4020 
5 TIQTTCCRIA CGGCCAATTT GGAQAT'CAAC CyVGGCCCATC CCCCSAAWSG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATlGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 
CATTCCSSATG AAATTTTAAC CTOCACCftAA AGTTCTGTTA CTGGTAAGGT ATTTaCTGGT 4200 
ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTOTTCX: TATAGGAAAT 4260 
GOGCATGTTO CCATTACAGC TGTTTCTCCC CACAGAQATG GTTCTGTAAC CTCRACAAaG 4320 
10 TTaCTGTTTC CTrCTAAGGC AACTTCIKAG CTORGTCATA GTGCCRAATC TOATGCCGGT 4300 
TTAGTGGGTB GIGGTGAAOA TGGTGACACT GATGATQA'CG GTOATGATGA TGATGRCW3A 4440 
GAXRGTGATG GCTTATCCAT TCATAAGTGT ATGTCATGCTr CKTOCXATAG AGAATCSiCAG 4500 
OAAAAGGTAA TGAATGATTC AGACACCCRC GAARACAGTC TTATGGRTCA GAATAATOCA 4S60 
ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAW3 TGTATCCTCA 4S20 
15 onCajSTKMA CTGGTATGOA CASAAQTCCT QGTAAATCAC CATCRGCAAA TGGGCTATCC 4680 

CAAAASCACA ATQATOSUUV AGAOQAAAAT GACATTCW3A CTGGTASTGC TCTGCTTCCT 4740 
CTCAGCCCTQ AATCTAAAQC ATGGGCAiSTT CTGACAAGTG ATGAAjQAAAS tCGGATCAGGG 4800 
CAAQQTACXrr CAOATAGCCT TAATGAGAAT GAGACTTCCR CAGATTTCAG TTTT(3C3«3AC 4860 
ACTAATOAAA AAGATGCTOA TGGGATCClG GCAOCAGGTG ACTCAGAAAT AACTCCTGGA 4920 
20 TTCCCRCAGT CCCCAAChTC ATCTOTTACT AGCSftSAACT CRGAAGTQTT CC3WX5TTTCA 4980 
GAGGCAGAGG OCAGTAATAa TfUSOCATQAG TCTOGTATTG GTCTAGCXGA GOGGTTGGRA 5040 
TCOGAGAAGA AGGCAGTTAT ACqcCXTGTG ATOGTGTCaG COCTGACTTT TATC TOTCT A 5100 
GTQGTTCTTQ TQGGTATTCT CS^TCTACTGG AGGAAATGCT TCCAGACTGC ACACTTTTRC 5160 
TTAflftCGACA GTACavTCCOC TAGA6TTATA TCSCACACXZTC CAAO^CCTAT CTTTCCAATT 5220 
25 TC3«aVIGATG TOGGAGC31AT TCCAATAAAG CACTTTCCRA ASCATGTTQC AGATTTACAT £280 
GCAAGTAGTO GOTTTACTQA AQAATTTSAQ AC&CTOAAaG ACTTTXACCA GOftAQTGCftS 5340 
AGCTQTACTG TTGACTTAGG TATTACAGCA GACftfiCTCCA AOGACCCAGA CAACAAGCAC S400 
AAjGAATCGAT ACATAAATAT C5GTTGCCTAT GATCATAGCA QGOTTAAGCT AOCMAGCTT 5460 
GCTGAAAAGG ATOGCAfkACT GACTGATTAT ATCAATGCCA ATTATOTTGA TGGCTACAAC 5520 
30 AOACCAAARa CTTATATTGC TGCCCAAGGC CCACTGAAAT CC31CAGCTOA AOATTTCTGG SSOO 
AGAAI>GATAT GOQAACKTAA TGIGGAAGTT ACTGTCATSA TAACAAACCX OGTSGA OAftA 5640 
GSflAGGACSAA AAIGTOATCA GTACTGGCCT GCCQATGGGA GlGaOSAGTA OGGGAACTTT S700 
CTGGTCACTC AGAAOaGTGT GCAASTGCTT GCCTATTATA CTGTGRaaAA TTTTACTCTA 5760 
AGAAACACAA AAATAAAAAA GGGCTCCXM AAAGOTUVGAC CCAQTGGAOT TGTGGXCACA 5820 
35 CAOTAlCSUCr ACACGCAGTC GCCTGACATO GG2WSTAaCAS AGTACTOCCT GGCWJTGCTG 5880 

AGCrTXOTGft GAMJSGCAfiC CTAITGtXSUUS OGCCATGCAG TOOGGGCIGT TGTO STCCAC 5940 
TGCAIXTGCTG GhSTTQGMJS AAOVSGGACA TATATTGTGC: TAGACAGTAV GTrOCAGCAG 6000 
ASTCAACAOQ AAGGAACTOT CAACATATTT GGCTTCTTAA AACWCRTCOG TTCACAAflGA 6060 
AATTATTTGG TACAAACTGA CJOAGCAATAT GTCTTCATTC ATGATACRCT GGTTOAGGCG 6120 
40 ATACTTAOTA AAGAAACTGA G6TGCTQQAC ASTCATATTC ATGCCTATOT TAATGCACTC €160 
CTOVTincrS GACCAGCAGG CAAAACAAAG CTAGAGAAAC AATTCCAOCX CCTCM3CC&G 6240 
TCAAA.TATAC AfiCRGAOTSA CTATTCTOCA GCX3CTAAAGC AATGGAACA0 GGAAAAGAAT 6300 
CGAACTTCIT CTATCATCCC TGTGG2AA6A TCAAGQGTTG GCATTTCMC OCTGRGTGQA 6360 
GAAGSCACAG ACTAGATCSIA TGCCTOCTAT ATCRTGGGCT ATTACCAGAG CAATGAATTC 6420 
45 ATCATTACCXt AGCACCSCTCT CCTTCAITACC ATCRAQGATT TCXGGAQQAT fiATATGSQAG 6480 
CATAATGCOC AAGxaSTGGT ISkTCATPOCT GRTO6CCRAA ACWOaCflGA AGATMOTT 6540 
GTTTACT06C CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGQT CACTCnSUEQ 6600 
GCTGRAGAAC ACAAATGTCT ATCTAATGAG GRAAAACTTA TAATTCAGGA CTTrATCTTA 6660 
GAA0CTACAC AGGATOATTA TOTACTTGRA GTGAGGCACT TTCAGTOTCXJ TAAATGGCXav 6720 
50 AATCCAGATA GCC0CKT1»0 ttAAAACTTTT GAACirATAA GTGTXATAAA JWaAAGAAGCT 6780 
GCCAKIAGQG ATOGGQCTAr GATTerECAT GftlGftGCATQ GAGGAOTGAC GGCAGGAACT 6840 
TTCTOTGCTC TGftCAAOCCT TATGCACCAA CTAQAAAAAG AAAATXCOGT GQAWSTTTAC 6900 
CawSGTAGOCA AQATGATCAA TCIGATGAGG OCRGGAGTCT TTGCTGACAT TGftGCAOTAI 6960 
CaSTTTCTCT ACAAAGTGAT CCTCAaCCTT GTBAGCSftCAA GGCAGGAAGA GAATCCATCC 7020 
55 aCCTCrCXGO ACAGTAA3XSG TGCAGCATTG OCTGATGGAA ATATAQCTGA GRGCTTAGAG 70B0 
TCTTTIUSXTT AACACAOAAA QGQGTGGSGG GACTCACATC IGAGCATTCKC TTTCCTCTTC 714D 
CIAAAATTAS OCAOGAJUUT C3«SICTAGTr CTGTTATCTG TTGATTTCGC ATCACCTGAC 7200 
AOTAACraC ATGACATBGG ATTCTGCOGC CAAATTTATA TCATTAACRA TGtWTGCCTT 7260 
irrTGCAAGAC TTCTAATTTA CTTATTATCT TTGAACTAAA ATOATTGAAT TTTACAGTAT 7320 
60 TTCTAAGAAT OGAATTGIGG TATTTTTTTC TGTATTGATT TTAACWJAAA ATTTCRATTT 7380 
AinSAGUrXA GGAATTC3CAA ACTACAGAAA ATGTTTGTTT TTAlSTGrCAA ATTTTTAGCT 7440 
OlIAarTTGXAG CAATTAICAQ arrTGCTAOA AATATAACTT TTAATAOUSI AGCCXGTAAA 7500 
TAAAAC3WCTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 7560 
AAXAATCTGT TACrtATTGT ARATACTGCC CTAGTOTCTC CATGGACCAA ATTTATATTT 7620 
65 ATAftTTGiayS ATTTETATAT TTXACTACTG AGTCAAGTTT TCTAGTrCTG IGTAATTGTT 7680 
TftCSTTTftAXa ACGIAGTTCA TTaGCTGGTC TTACTCTAOC AOTTTTCIGA CATTOTATT6 7740 
TOTTRiMXaUk GTCATTAACT TTQTTTCaGC ATGTAATXrr ARCTmOM aRRRATOGAA 7B0O 
ATACCTTCAT TTTGRARjSAA GrTTTTATGA GRATAACSlCG ITACCAAACA TTOTTCSiAAT 7860 
QQTTTTTATC CAAG3AATTG CAAAAATAAA TAT3\ARTATT GGCAXTAAAA AAAAAZUkAAA 7920 
70 AAAARAAAAA AAAAAAAAAA A 

Seq ID NO: 382 Protein sequence
Protein Accession #: NP_002842.1

1 11 21 31 41 51

| | | | | |
MRimFLAC IQLI.CVCHLD WANGYYIiaQIl KbVBBlGWSY TOAUlQiKNWG KKYETCNSPK 
QSPmiDBDI. TQVNVNLKKI. KFQGWDKTSI. ENTPJHKTQK TVEHHiTSDY RVEGGVSEMV 
FFASKITPHH GKCWMSaCGS BH8I.BaaKFP I^EMQIYCFDA DHFSSPJEBAV RGKGKLBAIiS 
80 lIiFEVGTBEIff IiDFKAIIDGV ESSVfiBPGMQA ALDPFlLLNI* UPHSTMOnTI VMPSiyrfiPPC 
TDTVDHIVEK DTVSiaE6Q£> AVFCBVIiTHQ QSGITVMLMDY IXSEMFREQQY KFSRQVFSSV 
TGKEBIHEAV CSSBCENVQA DFBNXXSIAV TWBRPKWYD TMIEKPAVLY QQIiDQEDQTK 
HBFUXDGYOD LGAlIdnfLIiD ZMSYVK^IVA XCINGIiYGlor SDQLIVDKPT DNPEZJ3IiFPB 
lalGTEBIlKE EBEGKDIEEG HIVXtSGBDSh. THQISXKBPQ ZSTTIHYNRX dKXHEAKIH 
RSPTRG8EP& QKanVPNTfil. MSTSQPVTKb ATSKDISL7S QTVTELPEHT VBGTSA8Lai)D 540 
GSKTVIfi9?H MtlI>86TAESIi KrVSlTEYBE BSLE.TSFKLD TGU^EDSSGSS PATSAIPFLS 600 
^ISQGVIPS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG lAVWFPSSTDl 660 
TAQPDfVCagaR ESFLQTNYTE IHVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 
5 TBVTPHAFTP SSRQQDLVST VNWYSQTTQ PVYNQSTPLft PSYSSEVPPJj VTPIxUMnlQI 7B0 
LHTTPAAS5S DSAXiHATPVF PGVDVSFESl L&SXDGAPUi PFSSASFSSB LFKBLBTV8Q 840 
ILPQVISATE SDlCVPIrHASIi PVAQGDU.I.E P5IJlQYSE>VIi STTHRASETIf BFQSESGVLY 900 
KTLMFSQVEP PSSDAMMHAR SSGPEPSYAI, SDHB6SQMIF TVS^fSSAIPV HDSVGVTYQG 960 

SbFSGPSHIP IPXSSLlTPr ASbLQPXHAIi SGDGEHSOAS SDSEPHiPDT DGLTA3UUISS 1020 

10 PVeVJkEPTYT TSVFGDDMKR. I.SICSSIIYGN E^m<QIPSEN BMVYPSESTV HPNUXlSHVltK 1080 

LHASLQETSV SlS&TKfiMFP GSLAHTTTKV FDH&ISQVPB NKFSVQPTHT VSQ&SGDT8L 1140 

XPVIiSiU76EP AS8DPASSEH Ii9PGTQIiliFY CT&A£F8TEV I;LQPSFQA8D VDTIiEiKrVItP 1200 

AVPSDPILWB TPKVDKISSr MLHLIVSNSA SSEMMIiH3T9 VPVFDVSPTS HHHSABIiQGI* 1260 

T15YASEKYE PVLiJCSESSH gvVPSI,ySND EL^QTANIiEI WQAHPPKGRH VFATPVLSID 1320 

15 'BPhSSSTItlKBJi IHSDBILTST KSSVTGKVPA GIPTVASDTP VSTDHSVPIG N6RVAITAV3 1380 

WaSDGGVXSt KIpIiFPSKATS SLSH&AKSDA GXiVOOGBDGD ITSSDGDDVDD BDSDGIiSIHK 1440 

CMSCSSYRE9 QEKVMHDSDT HEHSLHDQNN P1S1£SLBESS EESSS^fTSVS SD&QTGMDRS 1500 

PGKSP&ANGL SOKHNDOKEB K9DIQT09AliIi PL8PE8KAWA VLTSDEEfiGS GQGTSDSUflB 1560 

IIETSTDPSFA DTKSKDADGI lAAQDSEITP GFPQSPTeSV TSENS2VFHV fiBAEASNSSH 1620 

20 BSRIBZASGb ESEKKAVZPIi VIVSALTFIC LWIiVGIIjIY nRKCFQTAHF YLEDaTSPHV 1680 

ISTPPTPXFP I5DDVGAIPI 1CHFPXHVADL HAS9GFTEEF BVIiKEFVQBV QSCTVDIAIT 1*740 

ADSSNHPDNK HKNRYINXVA YDHSRVKIAQ lABKDGKLTD YIHRNYVDCy HRPKAYIAAQ 1600 

GPXKSTAEDP HRMIHEUNVK vrVMlTHLVE KBHRKCDOVH PADGSEEYCTT FLVTQK8VQV 1860 

IiAYYTVRNFr XjBHTKIKKGfi QKGRPSGKW "rQYHYTQWPD KCFVPEYSLPV IiTFVEKAAYA 1920 

25 KRHAVGPWV HCSA6VGRTO TIZMUDBVajQ QlOHBO^TVlfl FGFUCHIRSO lUIYLVQTE^Q 19^0 

YVFIHDTLVB AZIiSKETBVI* DeHIHAYVHA LLIPGPWSCr KLSXQFQLL8 QSNlQQSDyS 2Q40 

AAUEQCSREiC HRT88IIPVB RSRVGIBSIWS GB07DYIKAS Y11K3YYQ5NB FirTQHPIiUI 2100 

TIKDFWRKIW BBZIAQLWMI PDGQMMREDE PVYHpHKDEP mCESFKVn* MAEBHKCLSir 2160 

EBKDIIQDPI IjEATUDDYVI. EVRHFQCPKW PNPDSPISKT PBLISVIKEE AANRDQPMIV 2220 

30 SDEB)SX3VTAa TFCALTTLMB QZ>EKEEISVDV YOVAXHUOiM RPGVFADIEQ YQFLYKVILS 22 BO 
lATSTRQEENP STSU^SBQAA LPDGtTCAESIi SSLV 



Seq ID NO: 383 DNA sequence
Nucleic Acid Accession #: NM_005688.1
Coding sequence: 126..4439

1 11 21 31 41 51 

CCQGGCAGGT GGCTCATQCT QQQGAGCGTQ ffTTGAGOGGC TQGOGCGGTT GTCCTCJOAGC 
40 ASGGGCGCM OAATTCTGAT GTOAftACTAA CftGTCrGTGA GCCCTGGAAC CTCOGCTCAO 
AGAAGATGAA GGATATCOAC ATASGAAAAQ ABTATATCAT OCSCCAGTOCT GGGTATA^AA 
GTGTGAGGGA OAGAACCAIGC ACTTClGGGA. OGCACAOAGA CCXSTGMlGAT TGC3^TTCA 
OQAaA&CTCG AOOGTTOGAA TGCC&AGATQ CX?ITGGARAC AGCIU3C00GA GCOaaGOQCC 

TCTCXcrraA tqcctccatg cattctcagc tcagratoct ggatgaggag c^tcccaagg 

45 QAAAGTACCA TCATGGCXTC? AGTGCTCTGA AQCCCATOCG GACTACITCX! AAACACCAOC 
AOCCftGTGGA CAATflClGGG CTTTTTTCCaT GIOTQACTTT TTOGTQGCTT TCTTCTCTOG 
C(X3BTGTGGC CCACAAQAAO GGOGAGCTCT CAATGGAASA CGTGTGGTCr CTGTOCAAGC 
AOGACTCTTC TGADGTGAAC TGOVGAAGAC TAGRQAGACT eiGGCAaGAA QAGCTGAATG 
AAGTTGGGCC AGACGCTGCT TGOCTGOQAA GGCrTGTGTG QATCTTClGC OGCACCACIGC! 
50 TCATCCTOTC eATGGTGTGC CTGATGATCA CGCTGCTQGC TGGCTTCAGT QGACCAGCCT 
TC&TG6IGAA ACACCTCTT<3 GftSTATACCC ABGCAaCWBft qT CTAftO CTG GAGrftCAOCT 
TSTTSmuST GCTGGGOCTC CTC3CTGACGG AAATOSnSOS GTCTTGOTCG 
CTTGGGCATT GAATTRCOtSA ACCGGTQTCC GCTTGCGGGG GGCCATCCTA 
TTAAGAAGAT CCTTAAGTTA AASAA CATTA AAOAfiAAATC OCTGOOTGAG 
55 TTTGCrCCAA CGATGGGCAG AOAATQTTTG AGGCAGCTGC OGrTGGCAOC 
6AO(3AlC3C3CGT TGTTGCCATC TTAGGCATQA TTTAXAKICT AATTAmnX? 
OCTTOCIGGQ ATO^QCIGTT TTTATGCTCT TTTAOOCAGC AATGKICTTT 
TCACAGCATA TTTCASGAQA AAATGCGTGG CCBCCmOBGK TGAAOQTOtC 
^ ATGAASTTCr TACTHWZATT AAATTTATCA AAATGTATGC CIGOTTCAAA 

60 AGACSXGTTCA AAAAATOGGC GftGGAGGAGC G^CX^SATATT GQAAAAAGCC 
AGGGTATCAC: TefGGGIGTG GCTOSCASTO TGQXGGXGAT TGGCAGO(?rQ 
CTGTTCATAT GACCCTGGGC TTGGATCTGA CAGCmSCACA GGCTTTCACA 
TCTTCAATTC CATGACTTTT GCTTTGAAAO TARCRCOGTT TTCAGTAAAG 
AAGCCTCW3T GGCIGTTGAC AGATTTAAJGA GTTTGTTTCT AATGGAAGAG 
65 TAAAGAACAA ACCAGCCAOT CCTCACATCA AGAXAGAGAT GAA A AATGOC 
GGGACTCCTC CCawCTCCZ^T ATOCAOAACT GGCGCAaQCT GMXXXCTAA 
ACAAlSAGGSC TTOCAGGGGC AAGAAAGABA AGGTGAGGCA GCT6CAGOGC 
AGGOGGTOCT SGCftGAGCAG AAAGGCCACC TOCTCCTQGA CAGTGA0GA3 
CCGAAGASGA AGAAGGCRAG CACATCCACC TOGGCCACCT GC3GCTTACaM3 
70 AlCAGGATOGA TCIGGAGATC GAAGAiGGQTA AACIGGTTGG AA!IC£GOGGC 
GXGGAAAAAC: CTCTGTCATT TCAGOCATTT TAGGCXSGAT 
TTGCAATCAG TGGAACCZTC GCTTAXCZiGG OCCM3C3U3GC 
TGAGAGACAA CATGCTGTTT GGGAAGGAAT ATGATGAAGA 
ACACtCTGCIG OCTOAGGCCT GAOCTGGCC3V TTCTTCCCA0 
75 GAGAGCGAGS AGCCAACCTG AGCGQTGGGC AGGGCCSGAG 
IGTATAGIGA CAGGAGCATC TACATCCTGG ACGAGOCCCT 
TGGGCAACCA CATCTTCAAT AGTGCTATCC GSAAACKTCT 
TTGTTAOCCA CCAOTTACAG TACCTGGTTQ ACTGTGATGA 
GCTGTATTAC GGAAAGftiOGC ACCCRTGAGG AACTGAIGAA 
80 CCATTTTTAA TAACCTGITG CTGGOAGnfiA CACCGC3CAGT 
AAACCAGIGG TTCACAfiAAO AAGTCACAAO ACAAGGGTCC 
AGGAAAAAGC AlGTAAAGCCA GAGSAAGGGC ABCTTOTQCA 
OTTCAGTGOC CTGGTCAGTA ^rATGOTaTCT ACATGCM3GC 
TOCTGGTZAT TAIGGCXTCTT TTCATGCTOA ATGTAGGCAG 



CrGGRTCCTC 
AAOATAOUU:: 
CAGGGACCIG 
GATCAOCCTT 
CAfSTGGCTTA 
CAAGTCCRAG 
AGTGATCITC 
TTTA&ATG6T 
TSAGATCAAT 
TAAAACAG6A 
GCXOGAAGAO 
TGGXGGGGGC 
CACOGQCZTC 



ACCAaiGGCAT 
CTCATCAACA 
CTQCTGGCIX3 
GGACCAACAG 
GCATCACGGC 
CAGAAGATGA 
GOVrTTTCTC 
GGGTACTTCC 
GTGACCTICT 
GTGGTQACAjG 
TOCCTCTCAG 
GTTCAQWIGA 
ACCTTGG^T 
ATOAAAAAAG 
ACIGAGCATC 
GGGCGCAGTC 
AOGACACTGC 
AGTGTGGGAA 
GAGGGCftOCA 
AATGCTACTC 
TGTGTGCTGA 
AOGGAGATTG 
GOCGGGGCCT 
GATGCCX^ZVXG 
ACAGTTCTGT 
ATOAAAGAGG 
SACTATGCTA 
TCTUU^AAAGG 
TCAOTAAAQA 
AAAGGGCAGG 
CCCTTGGCAT 
6GTTGAGTTA. CTGGATCAAG CRAGGAAGCG G(3AACRiCCAC TGiaACTCJGA 2820 
CCTCQGTGAG TGACAfiCftTG PAGBKOHIiTC CTCATATGCA GTACTATGCX: AGCa TCTROQ 2B80 
CCCrCirOCAT OSCAGTCATG CTCavrOCTGA AAQCCATTCQ AGGAOTTGTC TTTGTCAAGG 2940 
GCftOGCTGCG AGCTTCCTCC CQCSCTQCATQ ACGAaCTTTT CCGAAGGATC CTTCGAAQCC 30 DO 
5 CTATGAASTT tTTTGACACG ACCCCCAOVG GG2VSGATTCX CAACAQGXTT TCCAAAGA^ 3060 
TGGATGAAOT TGACGTGCGG CTGCCQTTCC AGGCXXASAT GTTCATO Caa AACSTTATOC 3120 
TGGTGTTCTT Cn^TGTGOQA ATGATCGCAG GAGTCTTCCC GTGQTTCXrrr GTGQCftGTGe 3180 ' 
GGCOCCITGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 
TGAAOaSTCT GGACAAl^ATC ACBCAC3TCAC CTTTCCTCTC CCACATCAC6 TCCftGCATAC 3300 
10 AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC APATACCAGG 3360 
AjGCTGCTGOA TGACRAOCRA GCTCCTTTTT TTTaSTTTAC GTGTGOSATQ CGGTGGCTGG 3420 
dGTGOQGCT GGACCTCATC AGCAT€SCCX; TCATCACCAC CAOiGGGCEG ATGATOiTTC 3dB0 
TTArroCAOGQ GC9U3A1TCCC CXaVQCCTATG CGGGTCTOGC CATCTCTTAT QCTGTCCAGT 3540 
TAACGGGGCT GTTCCAGTTT ACOGTCAGAC TGQCATCTGA GACAC3AAGCT OGATTCACCT 3600 
15 CGGTCSOABAQ GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT QCCAGAATTA 3660 
AGAACAAGGC TCCCTCCXXTT GACTGGCCCC AGGRC3GGAGA GGTQACCTTT GAGAACQCAG 3720 
AGATOAOCSTa CJCGAGAAAAC CTCM^CTCTTG TCCTAAAOAA AGTATCCTTC ACGATCAAAC 37S0 
C£AAAGAGA& GATTOGCMT GTGGGGOGGA CRGGATCAGG GAAGTCCTCG CrGGGOATGQ 3840 
CCCrCTTCCB TCIG6TGGA0 TTATCTGGaG GCTGCATCAA GATTGATGQA GTGAGAATCA 3900 
20 OTGATA-ntSG OCTTCCCBAC CTOOGAAGCA AAOTCTCTAT CATTCCTtSUl GAfiCCGOTOC 3960 
TGTTCa^GTGQ CACTGTCAGA TCAAATTTGG ACCXKTTCAA CXMrcACACT OAMSACCAGA 402O 
TTTGGGATGC CCTGGAGflGG ACACACATGA AAOAAMTAT TGCTCAQCTA CCTCTGAAAC 4080 
TTGAATCTBA AGTGATGGA0 AATQGGGATA ACTTCTOWSr GGGGGAAOGO CASCTCTTGT 4140 
GCATAGCTAG AGCCCTGCTC CX3CCACTGTA AtSATTCTGAT TTTAOATGAA GCCACAQCTG 4200 
25 CCATOQACaC AGAGACAGAC TTATK3ATTC AAGAQACZCAT CCX3AGAATCA TTTGCAG&CT 4260 
GTAOCATGCT GAC3CATTGCC CATOGCJCTGC ACAGGGTTCT AGGCTCCGRT ABOATTATG6 4320 
TGCTOGCOC& OOGACAGOra GTGGAGTTTG ACACCCC3CTC QGTCCTTCTG TCCAACGACA 4380 
GTTCOOGATT CTAarGCGATG TTTOCTGCTG Ca«3AGAACAA OGTCGCXGTC AAjCSGGCTGAC 4440 
TCCICGCIGT TGACOAAGTC TCTTTTCTTT AtaAQCATTGC CATTCX3CTGC CTGGQGOaQG 4S00 
30 CCCCXCATOG C6TCCTOCTA CCQAAACCTT GOCTTTCTCB ATTTTATCTT TCGCACAGCA 4560 
GTTCOGGATT GGCTTC5TGTG TTTCTVCTTTT AiSGCSAGAGTC ATATTTTGAT TATTGTATTT 4620 
ATTCCATATT CATGTAAACA AAATTTAaSTT TTTQTTCTTA ATTQCACTCT AAAAGGTTCA 46B0 
GGQAACCGTT ATTATAATTG O^ATCAGAGGC CTATAATGAA GCTTTATACG TGTA GCTATA 4740 
TGTATATATA ATTCTGTACA TAGCCTATAT TTACAGTQAA AATGTAAGCT OTTTAITTTA 4 BOO 
35 TATTAAAATA AGCACTOTGC TAATAACRGT GCaTATTCCT TTCXATCATT TTTGXACAGT 4860 
TTCCTOTACT AGAGATCTGG TTTTQCTATT AGftCTGTAiaa AAGA6TAOCA TTTCAtTCTT 4920 
CICTAGCIGG TCGTTTGACX3 GTGGGAGGTT TTCTGGGTGT OCAAAGGAAG AOSTC5TGGCA 4980 
ATA0TGQGCC CTCCGACflGC CX!0CTCTGOC GOCTCXXCTC AfiCOGCTCCR GQQGTGGCT6 5040 
GAGAGGGGTG GGCGOCTGGA GACCATGCAG AGCBCOGTGA GTTCTtaGGG CTCCTQCCTT 5100 
40 CTCTCCTOC3T QTCACTTACT GTTTCTGTCA GGAGAGCAOC QGGGOGftABC OcaQGDCCCT S160 
TTTCACTCCC TCCATCAASA ATG066ATCA CAGAG?UaTr OCICOSASCC GGGGAOTTTC 5220 
TTTCCTQCCT TC T ' I CTTTTT GCTOTTGTTT CTAAACAAGA ATCAGTCTAT CCfiCAGRGAG 5280 
TCGCACXGCC TCA0QTT{3CT A^GGGTGGCC ACTGCACAGA GCTCIXX3iGC TOGAAGACCT 5340 
GTTGGTTCCA AGCCCTGGAG CCAACTOCIG CTTTTTGACJG TGGCACTTTT TCATTT GCCT 5400 
45 ATTCCCau^aC CTGCACAGTT CAGTGSCAGG GCTCAGOftTT TCGTGGGTCr eTTTTCCTTT 5460 
CTCACCGCAG TCQTGQCRCA GTCTCTCTCT CTCTCTOCCC TCAAACTCTG CAACTTTAAG 5520 
CA0CXCTTGC TAATCAGTOT CTCaCACTGG CGTAOAAGTT TTTGTACTQT AAAGM^CT 55B0 
ACCTCK3GTT GCTeexTGCT GTOTGOTTTQ 6TGTGTTCCC QCAAAOCCCC TTTOTGCIGT 5640 
GGGGCTGGTA GCTCAGGTOG GOSTGSTCAC TGCTGTC3VTC AGTTQAATGG TCAGCGTTOC S700 
50 ATOTOTrOAG CAACfAGACA TTCTOTOSDC XTAGCATQTT TGCTGAACAC CTTQTGGARG 5760 
CAAAAATCTG ARAATOTQAA TAAAATTKTT TTSGAlTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAARAAAAA AAAAAAAA 

Seq ID NO: 384 Protein sequence
Protein Accession #: NP_005679.1

1 11 21 31 41 51

| | | | | |

KKDIDIGKKY IIPSI>GYRSV RBRTSTSQTa BDREDSKFRR TUPLBCQiDaL ETAAEAEGIjS SO 

60 IiDA£HUfiOLH IXiDEEBPKGK ^CflHGLSAIiSP IRTTSKaiQEIP VDNAGUFBCM TFSHLSSLAR 120 

VAHKKGBLSM EITVHSLSKHB SSDVSlCRXlIiB 1UiWQBEL3IE7 OPDAASLBSV VKZFCRIRLI 180 

X.61VCLMITQ BAGFSOPAfM VKHIiliEYTaA TESULQYSLL XiVI^aUiXirEI VRSnSLAIiTH 240 

AiaffYETOVRL RGAXLTKAFK KIIKLWUKE KSLGBLINIC S29DQQRHFEA AAVGSLLAGG 300 

^ PWAILGHiy WVIIIiGPTGP LGSAVFILPSf PAMHPASRLT ATPRRKCVAA TDBKWQyMNB 360 

65 VLTyiKPIKM YAHVXAPSQS VQKIREEERR ILBKBOSTFOG ITVGVAPIW VIA6WTFSV 420 

HMTJ:iG3PDI»TR AQAFTWTVF MBMTFAIiKVT PFSVIC8L8EA SVAVDRF3C8(b ggiWE HmMIK 480 

inCFASPBIKI E^fKNATllAHD 8SK881QNSP KbTFIMKKDK RASRGKKEKV RQIiQRTEEIQA 540 

VLABQKQHDIj liDSDBRPSPE SEEGKHIHIiO HIiRLQRTLUS TDtEiElQBGKIi VGIOGSVGSG 600 

KTSliIfiAltG Cfitthl^BGBJA XSGTPAWftQ QAHUiHATXtlt IIiaLPGFKH> EERWISVLNS 660 

70 CCLTtPI>DAXI« PSSDIiTBlGB RGAKI*aGGQS QRISUUlAIiY BPRSI YIUD PLSAIJQAEVa 720 

EIHIF£t&AIRK HIiKSKTVLFV THQLQYIjVtX: DEVIFKKBGC ITERGTHBBIi MNlJgGDYATI 780 

FKNIibI/3BTP PVSIN5KKET S6SQKKSQDK GFKTQSVXKE KAVKFEBGQL VQLEEIQQQGS 840 

VPtfSVYGVYI QAAGGPIAFIr VJKKUFVOJSV GSTAFSTITHI. SyfflKQGSGn TTVTRC3METS 900 

VSDSMKEKPH MOfYYASZYAIa SMAVMLIU6CA IRGWFVKGT IjKASSRIiHDB LFRRIUiaPM 960 

75 KFFDTTPTI31 UjNRFSKDMD EVDVRIiPFQA EMPXQNVIIiV FPCV7GMIAGV FFHFLVAVGP 1020 

I,VIIiP£VI£I VSaVLIHEIilC IlUQNZTQSFP I^HITiSSZQG IiATZHA%KXIG QEFIiHRXaEIi 1080 

LDDiaQAPFFIr FTCMRHLAV RIiDLISIALt TTTOXlflVLH EGQIPPAIEAS IiAISYAVQEkT 1X40 

GLFQPTVRLA SBTBARFTSV EMNHYITCTL fiLEAPARIKW KAPBPDWPQE GBVTFKtlAHfl 1200 

RynSHIfPIjVIi KKVeFTIKPK EKIQXVGRTG SOKBSLGMAL FRliVELSOGC IKlDGVRISD 1260 

80 IGLADUlSKIi 61IPQEPVLP &GTVR5NU)P RiIQyTE[3QIK OAItERTBMKB CIAQI*PIiKLB 1320 

SBVMENGDHP SVGERQUiCI ABKLUlHCia IiIUJiSATAKH DTETDlLIiI{]iS TXREAFADCr 1380 
MEiTZAHRIOT VlfiSDSZKVIj AQGQWBFDT PSVIiIiaNDSS RFYAHBAAAB NKVAVXG 
Seq ID NO: 385 DNA sequence

Nucleic Acid Accession #: NM_001327.1

Coding sequence:

1 11 21 31 41 51

| | | | | |

AGCAGQGGGC GCTGT6TGTA COSAGAATAC CA6AATACCT CGTGGGtXJCT GACCTTCTCT 60 

CTQAGAGCCX3 GGCA(SRG<3CT CCGGAQCCAT GCAGGCCGAA GGCOQGQGCA CAGGGGGTTC 120 

GAOGGGCGAT GCTSATOGOC CACBSAGGCCC TOaCATTCCT GATGGOCCAQ GGGGCAATGC IBO 

TGOCGGCCCA GGAGAGGCQG GTGCCACQQQ CGGCAGAGOT CCOOGOOGGG CAGGGOCAGC 240 

AAGGGCCTC3G GGGCCGGGAG GAGQCGCCCC GCGGGGTCOG CMt3GOGGCX5 CXSGCTTCAOa 300 

GCTQAATGG& TGCTGCAQAT GOSGGGCCAa GGOGCOGGAB AGOGGCCTQC TTGftGTTCTA 360 

CCTOSCCATG CCTTTCGCGA CACCCATGGA AGCAC3AfiCTG GCCCGCAGGA GCCTOaCCCA 420 

GGATGCCCX3V CCGCTTCCOO TGCCAQGCSQT GCTTCTGAAG GAeTTCACIG TGTOCGGCAA 480 

CRTACTGACT ATCCGACTGA CTOCTGCSiGR CCACC3GCCAA CTOCAGCTCT CCATCAfiCXC 540 

CTOTCTCCAG CAGiTTTTCCC -£GTT6ftTQTG GS^TCftOGGAG TGCTTTCTGC C3CGTGTTTTT 600 

(KaCTCAGCCT CCCTCAGGGC AOAGGCXSCIA AGCCCSUsCCT GGOGCCCXTTT CCXAGGTCAT 6£0 

€OCrCCTCX:C CTAOGQAAIC 6TGCCAC3CAC GAGTOQCCftB TTCATEGTOS GGGCSCTGATT 720 
GTTTGTOGCT GGAGGA0GAC CaCTTACMG TTTQTTTCTG TUaAAAMAA 

Seq ID NO: 386 Protein sequence
Protein Accession #: NP_001318.1

1 11 21 31 41 51

HQAE6R6TGG BTCJDADGPGG PGXPD0FGGH AOGPOBAiQAT GGRSPROAGA ARASGPOGOA 60 
PKGPHOGAA5 GIiKGOCRCX3A BGPEBEOiIiBF YliAMPFATPM SABLASRSIjA QDA&PEkPVFG 120 
VXiIiKBFTVSG NIIiTlRLTAA DERQIiQIiSIS SCLQQjL&LLII WJTQCPESVP LAQPPGGCJIBR 

Seq ID NO: 387 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52..459

1 11 21 31 41 51

CCTCGIGQGC CCTQACCTTC TCTCTGAGflG GCOGGCAGAG GCICC6GAGC CATQCaGGGC 60 

GAAGGCCftGG GCACAGGGGG TTCGA0GGC3C GATGCTOATG GCOCAjSGASa CCCTGGCATT 120 

CCIGATOGOC CAfiGGOGCAA TBCTGGOQGC CCAGGAGAGG GGQQTGGC3VC GGOCXSQCAGA ISO 

OaTCCCCXSGG GCGCSGGGQC AGCAAi3t3GC3C TOGGGGCOGA GRGGAGGOQC CCCGCGGGGT 240 

COGCMXWOG GTGCaSCTTC TOCQCAGCSAT GQAfleGTGCC CCTGCGGGGC CAGGAOGCCjG 300 

GACA0CCX3CC TGCrTCaOTT GOGACTQACr GCTGCAlQACC AOGGGCAACT GC3U9CTCTOC 360 

ATCAGCTCCT OTCTCCftGCR GCTTTOOCTG TTOATStGGA TCACaCAGTG CTTTCTGCOC 420 

GTOTTTTTGG CTCAOGCTCC CTC3^0GGCAG AGGOGCTAAG CCCAGCKTQS CGOCCCTTCC 4S0 

XAGGTCATGC CrOCTCCCXn? AQGQAATGGT CCCAGCTkCGA GTOaOCAGTT GATTGlGGGG 540 

GCCTGATTQT T1GT06CTGG ASGAGGAGGG CTTACATGIT TQTTTCTGTR OAAAAO^AAAG 600 
CTGAGCTA 

Seq ID NO: 388 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

MO&SGQGTGG SXGCASGPGG VGIPDQPCaQI JUSCaPOBAQAT OGRGPRGAOA A&ASGESGQA 60 
FRGPHGGAAS AQDGRCPOGA SSPDSRIiLOF HUEAADHRQL QLSIfifiCLQQ XiSIOHlfXTQC 120 

Seq ID NO: 389 DNA sequence
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671

1 11 21 31 41 51 

ACftGCGGAOC GOiSAGTGAG AAOCACCaWiC C3OIW3GOQC00 GGCAGCGACC CCTG(2ASC3GG 60 

AOACAGAGAC TGAQOGGCCC GGCACOBCCA TGCXTPGCQCT CTGGCPGaGC TGCTGDCTCT 120 

GCTTCTCaCT CCTOCTGCXX: GCAGCCOGGG CXSWCKTOCAO GAfiGGftAGTC TGTGATTGCA 180 

ATGGGAAGTC CAQGCAGTST ATCTT*rGA:rC GGGIUkCTTCA CMSRCMMCX GGTAATGOaiT 240 

XCXSSCTQCCT CAACTGCaUCT GACAACftCIO A1GGC«TTCT> dGGGAGAAO TGCAAfiAftTG 300 

SCTTTTACOa GGACAGAGAA AOGGACOGCT GTTTOCCCTG CAATTOTAAC TOCAAAOTrT 360 

CTCTTAGTQC *rGGATGTGAC AACTOTGGAC GGTGCftGCTG TAAAOCftGGT GTGACAOGAG 420 

OCAGATGOGA COOATQTCTG CCAGGCITCC ACATGCTCAC GGATGGGGOO TGCACCXaAO 460 

ACCAGAGACT GCXAGACTCC AASIOXGACT GTSACCX3USC TGGCATOSCA GGGC20CTGIG 540 

ACGOGGQCOG CTGTOTCTQC AAGCCAGCTG raurrGGAGA ACX3CIGTGAT AGGTGTOQAT 600 

CAGGTTACTA TAATCTGGAT GGOQSSAACC CTQAGGGCT6 TACCCa^TGT TTCTQCIATG 660 

GQCATTCAGC CAGCTQCCQC AGCTCTGC3«3 AATACAGTGT CCATAAGATC ACXTCCTACCT 720 

TTCATCAAGA TGXTGATGGC TGGAAGGCTG TOCAAlQGAftA TQBGTCarCXIT GCAAAGC TCC 780 

AATGGICACA. GOQCCATCAA GAmTGTTTA GCTCAGCX3CA AOSS^CIAGAC CCPCTCTATT BAO 

TTGTGGCTCX! TGOCAAATTT CTTGGGAATC AACSbSGTGAG CTATGGGCAA AGCCTGrOCT 900 

TTGACTAOCG TOTQOAlCAGA GGAGGOU3AC AOCCATCTQC CCATGATGTG ATTCTGGAAG 960 

QTGCTGGTCT AOGGATCACA GCTCOCTTGA TGCCACTTGG CAAQACACTG CCTTGTGGGC 1020 

TCACCAAGAC TXACACATITC AGGTTAAAI6 AGCATOCAAG GAATAATTG6 A6CCC0C3\GC 1080 

axSAGITAClX TGMTATGQA AGGTTACTOC QGMTCKX AeOCCTCOGC ATtXaGftGCTA 1140 

CftTATGeAGA JOnCRGIACT GGGTACATTG ACAAIGTGAC OCTGATTTGA GCOOOODCTG 1200 

TCTCIGGAOC COCAGCAOCC TGGGIT6AAC AGXGTATAIG TCCl&iTGGG TACAAGGBBC 1260 

AATTCTGOCA GGMCTGTrGCT TCTGGCXACA AGMS^GATTC AGGGAaAClG GGGGCTITTG 1320 
GCAOCTCXAT TCCTTQTAAC TGTCAAGOOG GAGOGGOCXQ TGATCCftGAC ACAGGAGATT 13 BO 
GTTATTChGS tSGaVTGiAGAAT CCTC3RCATTG AffltSTGCtGA CTGOCCAATT aSTTTCTACA 1440 
ACQATCOGCA CJtSACCKCCGC AGCTQCAAGC CATGTCCCTG TCATAAOGGG TTCJVSCTGCT IS 00 
CftGOXaAlXSCC QC5AGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCCGOG <JICACXS3G1^3 1560 
CCX3GCTGTGA CCTCTQTGCT GJVTGGCTACr TTQOGGACCC CTTTGGTGAA CATGGCCCAG 1620 
TGAGGCCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGOAATT 1680 
GTGACCX3GCT GACACQCAGG TGTTTGAAGT GTATCCACAA C3\GAjGCCaQC ATCIACTOGG 1740 
ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCC3\fiCA GACAA6TGTC IfiOO 
GAGCITGCJVA CTGTAACCCC ATGGGCTCAG AGGCTGrAGG ATGTOGAAGT GATGGCACXn: 1860 
GTGTTTQCAA QCCAGSATTT QQTGGCCCCA ACTGTGAGCA TGGAGCATTC AGC^TGTGCAG 1920 
CIIGCTATAA TCAAfiTGAAS ATTCAGATGG ATCAOTTPAT GCRGCA6CTT CAGA0AATG6 1980 
AGGCCCTOAT TTCAAftGGCT CABGOTOaTQ AXGGAGTAGT AOCTGATACA GAGCTGGAAG 3040 
GCAGGATGCA GCAGGCTGRG CAGGCCCTTC AGGACATIGT GAQAGAlGOC CAGATTTCAG 2100 
AAGGTGCTAG CAQATCCCTr GGTCTCCAGT TGGCCAAGQT GAGGAGCCAA QAOAACAGCT 2160 
ACCAGAGOCG CCTGGATGAC CIOAGATGA CTGTOGAAAG AGTTCSeGGCr CTGGGAAGTC 2220 
AGTACCAGAA CCGaGlrTCJOG GATACrC3U2A GGCTCATCAC TCftGATGCftG CTQZUSCCTGG 22 BO 
CAlGAAAGTGA AGCTTCX^TTQ GGAAACACTA ACATTOCIGC CTCAGACCAC TftCGTOGGGC 2340 
CAAATOaCIT TAAftAGTCTG GCTCAGGAGG CCACAA6ATT ASCAGAAAGC CAC6TTGAGT 2400 
CAGGCAGTAA CATGQAGCRA CrGAC3W«3Ga AAACTGAGGA CTATTCCAAA CAAGOOCTCT 2460 
CACTGGTGCG CAAGGCCCTG CATGAAOOAG TCGGAAGCGG AAGCGGTAGC COGG&CXaGlG 2520 
CTGTGGTGCA AGGGCTTGTQ GAAAAATTOO AQAAAACCAA CTCQCTOGQC CAGCAGT1X3A 2560 
CAAGGGAQGC CACTCAAGOG GAAATTQAAG CRGATAGGTC TTATCAGCAC AQTCTCOGCC 3fi4D 
TCCTQGATTC AGTGTCTCOa CTTCAGOGAQ TCAGTGATCA GTCCTTTCAG GTGGAfiOAAQ 2700 
CAAAQAOOAT CAAACAAAAA GOGQAT^TCAC TCTCAAOJCrr GGTAACCAGG C3VEATGGATG 27fiD 
AGTTCAAGCG TACACAAAAG AATCrGGGAA ACCGGAAASA AGAAGCACA6 CAOCTCTTAC 2820 
AGAATGGAAA AA6TGGGAGA GAGAAATCMS ATCAGCTQCI TTCCCOTQCC AATGTTGCIA 2880 
AAAGCSGAGC ACAAGAAiSCA CTGAQTATBG GCAATGCCAC TTTTTATGAA (TTTOftGAQCA 2940 
TCCTTAAAAA CCTC«3AGAQ TlTIGAOCTGC AGGTGGACAA CAGAAAAiBCA GAAGCTGAAG 3000 
AAGCCATGAA GAGAClCTCC TACATCA6CC AGAAGGTTTC AGATGCCAGT GACRAGACCC 3060 
AGCAAGCAGA AAGAGCCCTG GGGAGCGCTTG CTGCIGATGC ACAGAlSGGCA AAGAAT OQGQ 3120 
COGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAAauaSA GATTGG8AOT CTGhACTTGG 3180 
AAQCCAATGT GACAQCAGAT GGAGCCTTGG CCATGGAAAA OQCaACTOGCX: TCTCTQAAiQA 3240 
GTGAGATCJAG QGAAGTGGAA GGAGASCTGS AAAGGAAGGA GCrrGGAGTTT GACAC6AAXA 33 OO 
TGQATGCAGT ACAGATQGTO ATTACAGAAG CCCAfiAAGGT TGATACCAGA GOCRAGAACG S360 
CTOQGGTTAC AAlOCAAGAC ACACT(2AAGA CATTAGAOGQ CCTCXTEGCAT CTGATGGACC 3420 
AGCCTCIICAa TGTAaATGAA GAGGGGCTGG TCTTACTGGA GCAGAAQCXT TGGGCSIU3CCA 3480 
AGAOCCAGAT CAACAGGCAA CTGCQGCCX2A TGAIOXCAGA GCTSGAAGAG AGGGCACGTC 3540 
ASCAGa^QQG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGOGATTCTG GCTGATGTGA 3600 
AGAACTTOaA GAACATTAGG GACAACCIGC CCCCAGGCTQ CTACAATACC CAOGCrCTTG 3660 
AGCaiACAGTG AAGCPGCCAT AAATATTTCT OlACTGAGGT TCTTGGORTA C aGAT CTOWG 3720 
GGCTOgOGAG CCATGTCAT6 TGABiaaG^rG GGATGOGOnC ATTTGAACAT OTTEAATGGG 3780 
TATeCTCAQS TCAACTGAOC tCGaWICOCATT CCTGATCCCA TGGCC»SGTG GTTGTCTTAT 3840 
TGCACCATAC TCCTTCGCTTC CTGAaOCrGG GCAATGAGGC AGAIAGCACT GOGTGTGASA 3900 
ATQATCStfiGG ATCTGOACCC CAAAGAATAG ACTOQAltSGA AAGACAAACT GCACAGGCAO 3960 
ATGTITGCCT CAZAATAGTC GTAAGTGGAS TCCrTGGAATT TGGACAAGTG CIO TTGGGAl' 4020 
ATAiGTCAACr TATTCTTTGA OTAnaCGIGAC TAAAGOAAAA AACTTT8ACI TIGCXX»G6C 4000 
ATGAAATTCT TCCTAaJGOrC AGAAC»GAt3T GCAACCCAGT CACftCTGTGG CCAOTAAAAT 4140 
AJCTPATMCCT CATRTTGTC3C TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CXTPACTTACA 4200 
ACCCAGGGTa TGAACAO^GTT CTCCATTTTC AASCTGGAAQ AAGTGAGCAG TOTTGGAGTG 4260 
AGGACCTGTA AGGCSiGGCCC ATTGaGAGCT ATQGTGCTTG CTGOTGOCTG CCAOCTTCAA 4320 
GTTCTGGACC TGGGCMPGAC ATCCTTTCTT IIAA3G»TGC CSUTCGCAACr TUaoSKtXGC 4380 
ATrrTTATTA AAGCATTXCS2 7ACCAGCAAA GCAAA<XGTTG GGAAAGTATT TAiJX A-ry TOO 4440 
OTTTCAAAGT GATAGAAAAG TOTaGCTTOG GCATTGAAAG AGGTAAAATT CTCKAGATTT 4500 
ATTAGTCCTA ATTCftATCCT ACTTTTOGAA CACCAAAAAT GATQOGCATC AATCyTATTTT 4S60 
AarCTTATCTT CTCAATCTCC TCTCTCTTTC CTCCAOaCAir AATAAOAGAA MCTCCTACT 4620 
CACACrrcaiO CIGGGTCACA T C CR T OOCTC CATTGATOCT TCOKCGCATC TTTCCSUEQCA 4660 
TTACCTCCSAl? OCATCCTTCC AACaTATSftTT TATTGAGTAC CIACTGTGTO CQVGGGSCTG 4740 
GTGGGACAGT GGTQACATAG TCTCTGCOCT CMAGAGTTG ATTGrCEAGT GAGGaftGaCai 4900 
AQCATTIXTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGXCACAAGT GGTGTTTATT 4860 
GCAATAACOQ CITGG3?TTGC AAOCXCITTG CXCAACAGAA CATATGTTGC AAGACCCTCC 4920 
CAXGGGGGCA CTTGnOTTTT GGCAftGQCXO ACAGnGCICT GGOITSTGCA CATTTCTTTa 4980 
CATTCC3USCT GICACTCTGT OCCTTTCTAC AACTGATTGC AACAGACTGIT TGACTTATGA 5040 
TAAGAOCAGT GGGAATTGCrT GGAOGAACCA GAGGCAGTTC GAGCTTGGGT GGGAAGACTR 5100 
TGOraCTGOC VrOCTTCTOT AX!rrCCTTGG ATTXTCCIGA AAGTOTTTTT AAAXAAAGiAA 5160 
CAATTGOTAG ATQOC 

Seq ID NO: 390 Protein sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51

| | | | | |

HPAUniiGCCZ. CFSliCiLPAAR ATSSREVCZlC MGKSRQCIFD HBLHSQTGNG FRCIAICKDNT GO 

DGlHCBKCKN GFYIOIREJUm CLPCNCfiTSKG SLSAHjCDNSG RCSCKPGVI^G ARCDRdiPQP 120 

RMI/rDAGCTQ DQRLIJJSKCD CDPAGXAGPC DAGRCVCKPA VTGERCDRCR gCJYTfllLEJGGN 180 

PBGCTQCFCY GKSA8CRSSA BVSVEKXT8T VPQPWCmKA VQRNGSPAKI* QS79C2RBQDVF 240 

SSAQRLDPVY FVAFAKFLGH QQVaVBQSLS fiI3VSVX»tGG& HPSAHDVIIfi GAGZAIIftPL 300 

HPI^SECriiPGG IfT K TyryBIJJ SSS&mSSQ IiSVFBVSRLXi BKIiTALRIBA TYGBySTGYl 360 

DSnrFIilSARP VfiGAPAPWVE QCICPVGYIQS QFOODCASGY KRDSAHIiOPF GTCXPCNOQG 420 

GGACDPDTGD CVSGDEETPDI BCADCPIGFY HDPHDPRSCIC PCPCHtKSFSC SVMPETEEW 480 

CSmCPPGVTG ARCELCADGX FGDPFGBRGP VBPOQPOQQf NHVDPSASGN CDRLTGRCDK 540 

CiaHTAGIYC SQCRAGYFGD PIAEMPADKC RAQiCSIPHaS EPV6CRSDGT CVCKPGPGGP 600 

nCBBGKFfiCP ACnjiafVXigH HQFMQQUKEtM EKLTSKAQGG PgWBDTBLB OSHQQABQAXi G^D 

QDIIjRSAQXS EQASRSLGI^ IiAKyRSC^EafS YQSSOiDDIiXH TVERVRALGS QKQNRVStDIH 720 

niilTEOMax-SZ. ABSEA6LGHT IffXFASDHYVG PNGPKSLAQB ATHI1AE8HVB SASHMEQUTR 780 
PCT/US02/36810 



ETBDYSKQAL SIATSKAZflEG V0SG9GSPD6 AWOGLVEKI. EKTKSLaQQI* TSEATQABIE! 840 

fiDRSYQKSLR LLDSVSRLQG VSDQ8FQVEB AKAIKQKADS 2j8TLVTKHMD EFKRTQKHLG 900 

NHKEEAQQIiL QN<3KfiGREK8 DQIiIiSSANIiA KSRAjQEAZiSH OiHATFYBVES lUOaZiREFDI* 960 

QVDMRKAB?^ EAMKHIiSYIS QKVSDASDICT QQAERALGSA AftDAQRAKIIG AGEAIiEISSE 1020 

IBQBIQSUEnj E»A?VTAD6AI» AKBKQIJVSIiK SEHRBVSGBL KRXEIaEXDTEl MDAVQMVITE ID BO 

AQKyDTRAKfeT ACVTIODfTUU TliDGLUILHD QPLSVDEEGI. VLLGOKLSRA KIQIKSQLRP 114.0 
QQRGHLRLItE TSIDOIIiADV KNLENlRDEJTi PPGCYNTQAL EQQ 



Seq ID NO: 391 DNA sequence
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221..856

1 11 21 31 41 51

| | | | | |

15 GAGCAAOCTC AGCTTCrAQT ATC2CftGACTC CAGCXX3CGOC CCGQGCGOGG ACTCCAACCC 60 

CDGRCCCAGAfi CITCTOC2U5C OGOGGCOCnG OQAGCAGGGC TCCCC3GCCIT AACTTCCTCX: 120 

GCGGGGCX:CA QCCACCTTCG GGAGTOOGGG TTQCCCAJCCT GCAAACTCTC CGGCTTCTGC 180 

ACCTGCCACC CCTGAGGCA6 CGCGQGCQCX: CGAjSCGAGTC ATGGCC3XACG OGGGGCTGCA 240 

GCIOTTGGGC TTGATTCTOS CCTlfCCTQGO ATQOATCSGGC GCCATOGTCA QCACTQCCCT 300 

20 GCCCCAOTOQ AGGATTXACT CCTATGCCGG COACAACATC GTGACOQCXX: AGGCCATGTA. 360 

OGAGOGGCTG T06ATGTCCT aCOreiGGCA eftGCAGGSGO CAGATGCAGT gCAAAQTCTT 420 

TGACTCCTTQ CTGAATCTGA GCAGCACATT (3CAAGCAACC OGTGCCTTSA TOGTG6TTOS 480 

CATCCTCCTG CaAaTORTAO CAATCTTTGT GGCCACCGTT OGCATGAAGT (aTATOAAOTG B40 

CTTCSGARfiAC GATGAGGTGC AGAAOAIGRC GATGGCTGTC ATTQOGGGTG CGATATTTCT 600 

25 TCITISCAGOT CTGGCTATTT TA23TTGGCAC AGCATGGTAT GGOUVTAQAA TCOTTCAAGA 660 

ATTCTATGAC GCTRICnCOC CASTCULTGC CAGGTAOQU TTTGQICAGG CTCTCTTCftC 720 

TGGCnSQQCT OCTGCTTerC TCTGCCTTCT GGGAGGIIGCC CTACTTTQCT GTTCCreiCC 780 

CCGAAAAACA AOCTTCTTACC C:2UVCACCAA6 GCCCTATCCa. AAACCTGCAC CTTCCAGCQQ 840 

GAAJtfiaCTAC ertSTGACACA GAGGCaUiAAQ GiWSAAAATCA TGTK3AAACA AAOOGAAAAT 900 

30 OCSACATTGAG ATACTATCAT TAACATTAGG AC3CTTAGAAT TTTGGSEATT eTAATCTOAA 960 

QTASGOTATT ACAAAACAAA CAAACAAAJCA AAAAACCCAT OTGTTAAAAX ACTCAGTGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT AJSGAGGQAAO ATTTTACCAT 1080 

TTOTATTACr GCrTCCC3VTT GAGTAATC2AT ACTCAAATGO GGGAAQGQGrr GCTCCTTAAA 1140 

TATATATA6A TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

35 CTCATTATQT TiaAXACTAGC ATACTTABAA TATCICTAAA ATAGOTAAAT GTATTTAATT 1260 

CSCftTATTGAT GaaaATOTTT ATTGGTKTAT TITCITTTTC GTCCTTATAT AO^ATOTAA 1320 

CAOTCAAATA TCATTTACTC TTCTTCATTA GCTTTGGGTG CCTTTOCCauC AJUSAGCIAlGC 13 BO 

CTAATTTAOC AAOOATOAAT TCTTlXaUVTT CTTCATGCXST GOCCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAOCACTTG CATCX3TTA1T AAGCCCTTTAT TTGTTTTGTG 1500 

TTTCATTGGT CTCTATCTCC ^XaAATCIAAC ACATTTCMA GCCIACKTTT TAOTTTCTAA 1560

AGGC3UU3AIU3 AATTTATTOC AAA3XMAAC TXTGGAGGCA AATCTTICTS CAXGAGCAAA 1620 

GTGATAAATr CCTOTTQACE TOCCCACACA ATOCCTOEAC rCSCMXXXT AaCACTCTTQ 1680 

TTTOCTTTSA AAATATTTGT CCAATTOAGT AGCTGCATSC TOTTOCCQCa. GGTGTTGTAA 1740 

CS^CAACTTTA TTOATTOAAT TTTTAAGCTA CTTATTCATA GITTTATATC CCCCTAAACT 1800 

AOCTTTTTOT TCCCCATTCC TTAATTGTAT TGTTTTCCCA ASIOTAATTA TCAtEQCXStXT 1860

TAX3UrCTT0C T2UVIAAGGTG T Otf TLT S lTT 6ICIGAACAA AOTGCTAGAC TTTCTGCSAOT 1920 

GATAATCIOS TGACKAArCAT TCICTCTOTA 6CTGTAAGCA AGTCACTTAA TCTTTCTACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGaGaTAAT GATACTTAAC CAGTTAGAAJS AGGTAQTGTa 2040 

AAIATTAATT AGTTTATATT ACTCTCATTC TrTGAACATG AACTATQCCT ATO:CAGT6TC 2100 

'^TATTTGCr CA!9CT0aCTG AGhCACTGAA GAAQTCACTG AACAAAACGT ACAGACQTAC 2160

CITCKXGIIGA TTCACTGCCr TGCTCTCTCT ACCnOTCTAT TXTOlCllGAA CAAAACCIAC 2220 

ACACATACCT TCATGTOSTT CAGTGCSCTTC CTCTCTCTAC GAGTd'A'ITi' CCACTQAACA 2280 

AAACGTAiOGC ACATACCTTC ATOTIK3CTCA GTGOCTTCXT CTCTCTACCSV GTCTATTTCC 2340 

ATTCXTTCAQ CTGTGTCTGA CATGTTTGTO CTClGTTCCA TTTTAACAAC TOCTCTTACT 2400 

TVTOCA6TCT GTACAdQAATQ dATTXCACT TGAGCAA0AT GATQXATGGA AAGGGTGTXa 2460

CquJiiaCHXai ; CrGGAGACCr OGATTTOlVSr CIIGGTGCTA TCAATOkCCSS TCTGTGTITG 2520 

AGCAIUSGCAX TTGGCIGCTO TAAQCTTAIT GCITCATCX0 TAAItSOtSGTGG TTTOTAATTC 2580 

CPOATCTTCC CACCTCACAG TGATGTTGTG GGSATOCAGT G2VGATAGAAT ACATGTAAGT 2640 

GTGGTTTTtTT AATTTQAAAA OTGCTATACT AAGGGAAAGA AIIGAOGAAT TAACTGCATA 2700 

CGTTTTGGTG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TGTTAAQAAA TGGGTTTCTT 2760

GCCTTTAACCA OTCTCTOIAG TEGAXGAC3ACA OTOAIUSTAAA ATTOftS aGCA CTAAACI3AAT 2820 

AAI^TTCTGA OGAAGTCTTA TCTTCTGCAG T^^iCTATGOC OCAATOCrm CT6TGGCTAA 2880 

ACAOATQTAA TGGGAA0AAA TAAAAGCCTA CBTCTrGGTA AATCCAACAO CAAGGGAGAT 2940 

TTTTGAATCA TAATAACTCA TAfliSGTGCrrA TCTGTTCAST GATGGCCICA 6AGCTCXTQC 3000 

TGTTAGCTGa CJlGClGACGC TGCIAGGATA QTTAGTTIG6 AAATGQTACT TCSVTAATAAA 3060

CTACACAAG6 AAAGTC3U9GC ACCGTGrCXT ATCAGGAATT GGAGCTAATA AATTTTAOTG 9120 

TGCCTTOCRA ACCTGftGAAT ATATCCTTTT OGRAGTTAAA ATTTAAAMQ CTTTTGCCM 3180 

ATACATAGA7 CTTCATGATO TQTGAGTGtrA ATTOCATOTG GAXATCAGIT ACCAAACATT 3240 

ACAAAAATUIT TTTATGGCCC AAAATGACCA ACSAAATTGT TACAATAQAA TTXArTCCAAT 3300 

TTTGATCm TTATATTGTT CTAOCACACC TGGAAACAGA CCAATAGACA ITTTOGGGTT 3360

VTATAATGGG AATTTQXATA AnSCATTACT CTTTTTCAAaP AAATTGTTTT TTAATTTAAA 3420 

AAAAGGAAAA AAAAAAAAAA AAA 

Seq ID NO: 392 Protein sequence
Protein Accession #: AAD16433.1

1 11 21 31 41 51

| | | | | |

MAHAQLQIiIjG PXI*AFIjOWrG AIVSTAIiPQW RIYSYAOEWI VTAQAMYBQI. WMSCVSQSlG 60 

QIQCKVEDSL UJI»SSTLQAT tUVLHTVGILL GVIAIFVATV GHKCM KCLB P DBVQKHRMAV 120
lOQAIFLLAGI IAILVATA17Y OORIVQEFYD VWXJBVmSXE PG(^IfTGHA AASIjCLLQGA 180 
UiC9C8CPKKX iSYPTFRPyP KPAPSeCKDY V 






Seq ID NO: 393 DNA sequence

Nucleic Acid Accession #: NM_006180.1

Coding sequence: 352..2820

1 11 21 31 41 51

| | | | | |

CCCCCATTCG CATCTAACftA GGAATCTGCG CCCCA.GAGAS TCOOBOACBC CGCJCQGTCXJG 60 

TGCCCGQCGC G03QQGOCAT GCAGCGACGO CX:GCX:G0GGA GCTCCGAGCA GOSCSTAGCQC 120 

CCCCCTCTAA AaCGOTTCGC TATGCOSGGA CCACTQTQAA CCCTGCCGCC TGCOGGAACA 180 

CXCTTCGCTC OGGACCAGCT CAGCCTCTdA TAA6CTGGAC TOGGCAOGCC CGCAACAAGC 240 

ACCGAGGAGT TAAGAGAOCX: GCAA0O8CA6 GGMJQGCCXC OOOGCAOGGG TOGOGGAAAlS 30O 

CGGCOQOTGC AGOGCGGGGA CAOaCACTOS GGCTGGCACT GQCTQCTAlBG GATQTCt3TCC 360 

TGGRTAAG6T GGCATGGACC 0GCCATGGG6 QGGCTOKsaa GCTTCTGCTG GCTGGTTGXG 420 

GQCTTCTGGR GGGCCGCTTT C3QCCTQTCCC AOSTCCTGCA AATGCAQTOC CTCTOMaATC 480 

TGGTGC3U3CXS ACCCTTCTCC TG6CATOGTG GCATTTCOaA GATTGGAGOC TAACAGTGXA 540 

GATCCTOaOA ACRICACCeA AATTTTCATC QCAAAOCASA AAAOQTTAGA AATCATCAAC «00 

GAAGATGAT6 TIGAACTCTTA TGTGGGACTG AGAAATCTOA CAATT6TGGA TTGTGGATXA ££D 

7UUVTTTDTGG CTCATAAAGC ATTTCTOAAA AAC3U3CAAOC TGCAGCAjCAT CAATTTTACC 720 

OGAAACAAAC TQACOAGTTT GTCTflGGAAA CtaTTCCGTC ACCTTSACTT GTCTGAACXG 7a0 

ATOCTGGTGG GC&ATCCATT TACATGCTCC TGTGACATTA TGrOGATCAA GBUJTCTOCAA 840 

GAGGCTAAAT CCAGTCCAGR GACTCAGGAT TTOTACrQCX: TGAATGAAA6 CAGCAAGAAT 90O 

ATTC5CCCTGG CAAAOCTGCA OATACCCAAT TGTG6TTTGC CATClOCRflA TCrOGCOGCA 960 

CCTAACCTCA CTOT^saAGGA AGGAAAGTCT ATCAlCATTAT CCTGTAGTGT GGCAGGTCAT 1020 

CCGGTTCCTA ATATGTATXG GGATCSTTQOT AAOCTGGTTT CCAAACATAT GAATOAAACA 10 BO 

AGCCACACAC AGCaoCTCCTT AAGGATAACI AACATTXCAT (OTATGACAG TGGGAAGCAG 1140 

ATCTCTTOTO TGGGGGAAAA TCTrGXAeSA QAAOAXCftAG ATTCIGICAA OCZTCACTOTQ 1200 

CATTTTGCAC CRACTATCAC ATTTCXCGAA TCTCCAACCT CAlSAOCACCA CTGGTGCATT 1260 

OCa^TTCACTO TQAAAJ3GCAA CCCCAAACCA GCQCITCROT GOTTCTATAA OGGGGCAATA 132 p 

TTOAATGAGT CCATUiTACAT CTQTAC^AAA ATACATGTTA OCAATCACAC GOACTCACCAC 13B0 

GGCXGOCTCC A<3CTGQBVTAA TOOCftCTCAC ATGAACRATa GGOACTACAC TCTAATASCJC 1440 

AAOAATGAGT ATGGGAAOSA T6AGAAAC3US ATTTCIGCTC ACTTCATGGG ClxaacXTTGCA 1 500 

ATTGACGAIQ (5IX3CAAACGC AAATTATCCT GATGXAATTT ATGAAGATTA TOGAACTGCA 1560 

OOaAATGRCA TGGGGGACAC CACQAACAGA AGTAATGAAA I'OCCTTCCAC AGAOGTCACT 1620 

GATAAAACCO GTCGGGAACA TCTCTOGGTC TATQCTQTGG TGGTGArTGC GTCTGTGOTK3 1660 

GGATTTTGCC TTTTGGTAAT OCTaTTTCTa CTTAAGTTGG CAAGAC!AC!TC ITAAOTTTGGC 1740 

ATQAAAGGCC CAGCCIOOBT TATCAGCAAX GATOATBACT CTGCCAGCOC ACICCATCAC IflOO 

ATCTCCAATG GGAGTAACAC TOCATCTTCT TCGGAAGGT6 GCXXaOATQC TOTCaJrrATT 1660 

GGAA.TGACCA ABATCCCTGT CATTGAAAAT CCCC3V£3TACr TTGGCATCAC CAAGAGTCAO 1920 

CTCaAGCX»G ACACATTTGtt TCftGCACATC AAQOSACATA ACAlTOTTCr QAAAAGGGAG 19B0 

CiajSGOGAAG QAGCCTTTGG AAAA6T6TTC CIAGCTOAAT QCilAXAACCT CIGXCCTGAG 2040 

CAGGAGAAGA TCTEGGIGGC AOTQAAQACC CIGAASSAniG GCAGTQACAA TQCACGCAAG 2100 

GACTTCCACe OTOAGGCGGA GCTCCTGAOC* AACCTCCSGC ATGAGCACAT CGTCAAGTTC 2160 

TATGGOGTCP GOGTGGAQQO CQACCCCCTC ATCATGGTCT rrG?W3TACAT GAAGCATQQG 2220 

GAOCXCAfl£A AOTTCCTCAG GGCACAOSGC OCTQATQOOG TGCTGATGGC TGAGGGCAAC 2280 

COGOOCAGOG AACrOACSQCA GT06CAGATG CTGCATAIAO CCCAOCA6AT OGOCGCGGGC 2340 

A^IGGTCTACC TGGOGTOOCat GCPiCTXCBTB CACCGC5ATT TOSOCRCCAG GAACTOCCTO 2400 

GTOGGGGAGA ACTTGCrGOT QAAAAT06GG GACTTTGGGA TGTCXICGGQA OGTGTACAGC 2460 

ACTGACTACT ACAGGGTOtSG TGSOCavCACft. ATGCTSOOCA TTOGCTGGAT GCXrvCCRORS 2520 

AGCATCATGT AC3\GQAAATT caCQAOGGAA AGCGACGTCT QGAGCCTOGG GGT0GTGTT6 2580 

TO GGABA TTT TCACCXATGG CAAACAGGOC TGGTACCAIBC TGTCAAACAA TGAGGTGATA 2«4Q 

GAXn^GTATCA CTCASSSCXSB AQTOCXaCAS OGACCGCGCA CGIGCCCSCCSL GGAGGI6TAT 2700 

GftGCTGATGC TGOQGXGCTG 6CA6GGA6AG CCCCI^CKTGA GQAAGAACAT CAAGGGCAXC 2760 

CATACCCnac TTCAGAACTT ^SCCAAOGCA TCrOOOCSTCr ACCTGOACAr TCTAGGCTAG 2B20 

GGCCCTTTTC CCCAGACCGA TCCTTOCCAA CGTACICCTC AGAQGGGCIG AGWEGATGAA 2BB0 

CAXCITTTAA CTGCOSCTGG AGGCCACCAA GCXQCTCTCC TTCACICXGA CAGTATTAAC 2940 

ATCAAAG ACr COGAGAAiQCr CTCGAGGGAA GCAGTGKSrA CTTCTTCATC CATAGACACA 3000 

GTATTGACTT CTTTTTGGCA TPATCTCTTT CTCTCTTTCC ATCCCOCTTG OTTGXTCCTT 3060 

-jri-i varx-iTTY TARATTTTCT TTTTCTTCTT TTTTTTCOTC TTCCCTGCTT CACGATTCTT 3130 

ACCCTTTCTT TTGAATCAAT CTGGCrrClG CATXACTATT AACTCTGCAT AGACAAAGGC 318 D 

CTT AACA AAC GTAATTTGTr ATATCACCftG ACACTOCAGT TIQCCCACCA CAACTAAGAA 324 D 

TGCCTT^TG T&TTOCIGOC TTIGATOTOG AIGAAAAAAA GGSAAAACAA ATATTTCSUrr 330Q 

TAAACTTTGT CACTTCTGCX GIA£3U]ATAT OGAGAGTTTC ttaaeHTTClL CTTCTATTTA 3360 

TTTATTAXTA TTACTOTTCT TATTGTTTTT GGAIGGCTTA AOCCTGTGTA TAAAAAAGAA 3420 

AACTTOTOTT CAATCTGTGA AGGCTTXATC TATGOGAGAT TAAAAOCAGA GAOAAAGAAG 3480 

ATTTA1TATG AACCX5CAATA TGGGAGGAAC AAAGACAACC ACSGGGATCA GCTGGTGTCA 3540 

G TCCC TACTr AGG AAATACT GAGCAACTGT TAOCXGGGAA GAATOIATXC GGCACdTCX: 3600 

CXTTGAGGAGC TTXCrORjaGA GTAAAAaSRC TRCIGGCCTC TOTGCCRTOS ATGATTCTTT 36«0 

TCCCAlTCAOC agaaatgata gostgcagta gagagcaaag atggcit 

Seq ID NO: 394 Protein sequence
Protein Accession #: NP_006171.1

1 11 21 31 41 51

| | | | | |

H8SWIHMBOP AMABIfVGFCn LWGFtnEAAP ACFTSCKCSA SRINC8DP8P GIVAfSKLBP 60 

RSVDPE HITK IFIANQEIlZiE IINBDDVBAy VSLUHLTIVD SGLKEVAHKA PMENSSWJHI 120 

HPlTEtHKXiTSt* SRSHFRHLDL 8£LlI>Vt3HSF TCSCDIHRIK TLQBAKSGPD TQOLyCZ«£lES 180 

SKNTFEANLQ IPNOfiLPBAN XiAAPHIiTVHS GK9XTI.SCSV AGDPVFNMYW nVGNHiVSKHM 240 

NETSHXQGSL RiTNISSDDa GKQIBCVAEH LVGEDQDSVH I»TVHEAPTIT PLESPTSOaH 300 

WCIPPT Vaag PKPALQKFyn (SKUMBBKXT CIKZHVnniT EVHGCLQLDEI PIBHHNGDYT 360 

LXAnaEVOKD EKQtSABEVCG KPGIDDGANP VXVmiYEnY GTAANDXGDT TmWBTB&£ ' 420 

DVTDKTGS&B ZiSVXAWVlA SWGFCXiIiVM If LLKIABHS KEGMKGPA&V TSSaSUDShSP 4 BO 

UntlSNGSIST PSSSEGGPDA VIIGMTKIPV IBHPQYFGIT NSQUCnDTPV QHIKRHHIVL 540 

KRBCiGBGAFG KVPLABCUlIi CPSQPKXIjVA VKTXiKDAEDDH ARKDSBREAB IiLTNLQHBHI SOO 
VKFTOVCVEG DFLIMVFEYM KaGDIiNKFLR AHGSDAVLKA. BGNPPTBLiTQ BOHLHIAQQI 660

AAGHVYLABQ HPVEEDLATR HCLVGENI-LV KJGDFGHGRD VySTDTCYRVG GETMLPIRHH 72 D 

PFESIHYRKF TTHSDVWSLO WLMEIFTYG KQFHYQUjSNN EVIBCITQ^ VliQRPRTCPQ 780 
EVYELPOiGCH QIIBPBMRKNX lOSTHTLLC^Ii AKASSVYLDI LG 

Seq ID NO: 395 DNA sequence
Nucleic Acid Accession #: AF410899
Coding sequence: 483..2999

1 11 21 31 41 51

| | | | | |

GGGAGCAGGA GCCTGGCTGG CTGCTTOGCT CGCGCTCTAC GCGCTCAGTC CCGGGOGGTA 60 

aCRGQAfSCCr GGACCCAGQC GCCGGCGGCG GGOGTGAGGC GCOGC3W3COC GGCCTCX3AGG 120 

TGCATACCGG ACCCCCATTC GCRTCTAACA AGGAATCTQC GCCCCPJSAGA GTCCOOGACG IBO 

CCGCCGGTCG GTtSOOCGGCe CXSC3CQGGCCA TGCRGOGftOG GCOGCCGCSGG AGCTCCGftQC 240 

AGCGGTA6CG CCCOGCTGTA AAGCGGTTGG CTATGCCQQ6 ACCACXGIGA ACGCTGCC6C 300 

CTGCCGGAAC ACTCETCGCT CCGGACGAGC TCaCCXTTCTG ATAAGCTGOA CTC3GGCACBC 360 

CCGCAACAAG CACOCSAGGftQ TTAAQA«3AGC CQCftAGCQCA GGGAAGGCCT CCCCGCACGG 420 

GTGGGGQAAA GCGGCCX3GTG CAGCX3CGGGG ACAGGCACTC QGGCTGQCAC TQaCTGCTAG 4aO 

GGATGTCGTC CTGGATAAGG TGGCATGGAC CCGCCATC343C QOOaCTCTGa GGCTTCXGCt 540 

GQCTGGTTCaT QQGCTTCTGQ AGOGCCGCTT TCX3CCTGTOC CAOOTCCTGC AAAT(3CA0TG 600 

OCTCTCGGAT CTGGTGCAGC GAOCCTTCTC CTGQCATOHT GGCATTTCCB AGATIGGAGC 660 

CTAACAGTQT A£3ATCCTGAG AACRTCAiCCXS AAATTTTCAT GGCAAAGCA<3 AAAAGGTTAQ 720 

AAATCATCAA CGAA0ATGAT GTIGAAGCrrT ATBTC^AGT 6AGAAATCTG ACAATTGTGG 780 

ATTCTGGATT AAAATTTGTG GCTCATAftAG CATTTCTGAA AAACAGC3UIC CTGCAGCaCR 840 

TCAATTTTAC CCQAAACAAA CTGAOOAGTT TSTCTAGGAA ACATTTCOGT CAOCTTGACT 900 

TGTCTGAACT GATCCTGGTG GGCAATCCAT TTACATGCTC CTGrOAtiATT ATGrGGATCA 960 

AOACTCTCCA. ASAGGCTA/IA TCCT^OTCCAG ACSkCTCAGGA TTTGTACTGC CTGAAT<?AAA 1020 

GCAGCAAGAA TATTCCOCTG GCAAACCTGC AI3ATACCCAA. TTISTGGTTTG GCATCTGCAA 1080 

ATdOGCCQC ACCTAACCTC ACTGTGGAGG AAGGAAAGTC TATCACATTA TCCTGTAGTG 1140 

TGGCAGGTGA TCCGGTTCCT AATATQrATT GGQA7GTTGG TAACCTGGT!C TCCAAACATA 1200 

TGAATQAAAC AAGCX!2U::aCA C^USOGCTCGT TAAGGATAAiC TAACMTTTCA TCCGATOACA. 1260 

GTGOGAAGCA GATCTCTTGT GTGGCGQAAA ATCTTOTAGQ AGAAGATCAA GAT^'CrrGITCA 1320 

ACSCTCACTGT GCATTTTGCA CCAACTATCA CATTTCTOGA ATCTOCRACC TCAGACCACX! 13 BO 

ACTGG T GCAT TCCATTCACT GT6AAAGG<^ ACXCCAAACC AGOGCTTCS^ rCOStXCTATA 1440 

ACSGGOaCAAT ATTGAA^TGAG TCCAAAI&CA TCTGTACTAA AATACATGTT ACCAATCACA ISOO 

OOGAGTAGCA CGGCXGCCTC CAGCTOGATA ATCX:CACTCA CATGAACAAT GGGGhCChCh 1560 

CTCTAATAGC CAAG2^TGAG TATGGGAAGG ATGAGAAACA GATTTCTGCr CACTTCATQG 1620 

GCIGGOCTGG AATTGACGAT GGTGCAAACG CAAATTATCC TGATGTAATT TATGAAGATT 16&0 

ATGOAACIGC AOCCSAATGAC ATCGGGGACA CXZAOGAftCAG AACTAATGAA ATOOCTTCCA 1740 

CM3AGGTGAC TGSVTAAAACX: 0CT08GGAAC ATCTCTCQ9T CXATGCTGT6 GVSGXGATTG 1800 

CSarCTGTGGT GGaATTTTGC CTTTTGGTAA TGCTGTTTCI 6CTTAAGTTD GCAAGACACT 1860 

OCRAGTTTGG CATGAAAGAT TTCTCATGGT TrOGATTTOG GAAAGTAAAA TCAAGAC2\AG 1920 

QTOTTGGCCC nxXXnCCQTT ATCRfiCAATG ATGATGACTC TGCCRGCCCA CTCXATCACA 19B0 

TCTCCAATQG GAGTAACACT CCATCTTCTT CGGAAGGTGO GCCAGATaCT GTCATTATTG 2040 

GAATOACCSA QATCCCrOrC ATTQAAAATC OCSCAGTACTT TGGCATC2U:C AftCAGTCAGC 2100 

TCAAGCCAGA CACATTTGTT CAGCACRICA AGCGACATAA CATTOTTCIG AAAAOGGaSC 2160 

TAGGOaAAQG AQCKTTTGGA AAAGTQTTCC TAOCTGAATG CTAIAACXTTC TGTOCTGftGC 3220 

AGGACAAGAT CTTGGTGGCA GTGAAGACCX; TGR3«3GATGC CAGTGRCAAT GCACX5CAAQG 2280 

ACTTCCACOS TGAGGOCBAG CTCCTGACCA ACCXCCASCA l^GAGCACATC GTCAAGTTCT 2340 

ATGGOSTCTG GGIGG9U8GGC GAOOCCCTCA TOLTGOTCIT TOAGXACATG AABCATGGGG 2400 

ACCrrCAACAA. OTTCCTCAGG GCACKCGGCG CTSKTGGGGT 6CTGATGGCT GAGGQCAACC 2460 

OGCCCAOQGA ACTGAjCGCAG TOGCAGATGC TGCA.TATAGC OCASCAGAXG GG06GGGGCA 2520 

TGGTCTACCr GG03TCCCAG CACTTCGTGC AOC6CGATTT GGCJCACCAGG AACTOGCTGQ 2580 

rOGQGBAGAA CTTGCrGOTa AAAATOQGQS ACTTTGOGAT GTOCGGOGAC 6TGXACASCA 2640 

CTGAGTACIA CAGGGIGGGT GGOCACACAA TGCTGCCCASr TOSCXGGATG CCTCCSGnGA 2700 

GCATCA1X3TA dUSGAAArTTC AOGACGGAAA GaSAQGTCTG GftGOCTGGGG GTOGTGTTGT 2760 

G^AGATCTT C!RCCTATGGC AA&CAGCOCT G<5TAOC3«3CT OTCaUUiCAAT GaGGTOATAQ 2820 

A9TGTATCAC TCAGQGCCGA GTCCTGCAGC GACOOCGGAC GTGCGGOCAG GAGGTGTATG 2880 

AGCTGATGCT GGGGTGCTGG CAGGGAGIVSC CCCACATOAG <3AAGAA£!ATC AAGGG<2ATCC 2940 

ATACCCTCCT iTCnfiAACXXG GOOMSGCKe CTOCXSGXCTA CCTGGACATT CTMGCXAGG 30OO 

GOOCTTTTCC CCAlGACOGAT CCTTCCCAAC GTACTCCXCA GAOGGGCTGA GAGGATGftAC 3060 

ATCXTXTAAC TGCCX3CTQGA QGCCACCRAG CTGCTCTCCT TCACTCTGAC AtSTATTAACA 3120 

TCAAAOACIC OQAGAAGCTC TCGAGGGAAG CAGTGTGTAC TTCTTCATCC ATAGACACAG 3180 

TATTGACTTC TTTTTGGCAT TATCTCTTTC TCTCTTTOCA TCTCCCTTOG TTGTTCCTTT 3240 

TTCrTTTTTT AAATTTTCIT TTTCTTCrTT TTTTTOGTCT TOCCTGCTTC ACGKnCTTA 3300 

OOCTTTCTTT TGAATCAATC TGGCTTCT6C ATTACXKTTA ACrcrqC3CTA GAOkAAGGCC 3360 

TTAACAAACG TAATTTGTTA TATCAGCAGA CACTCCAGTT TGGCCACCAC AACTAACAAT 3420 

GCCTTGTTGT ATTCCTGCCT TTGATGTGGA TC3AAAAAAAG GGAAAACAAA TAITTTCACTT 34B0 

AAACITTGTC ACITCTGCTG TACAGATATC GRGAGTTTCI ATGQKTTCiAC T^CCATTIAX 3540 

T^XerHATCKr TACTGITCTT AXTGXrxrtG GATOGCTTAA GGCIGTGZAT AAAAAAQAAA 3600 

ACITGTOTTC AATCT6TGAA GOCTTTATCT A^lGGaAGATT AAAACCAOAG AGAAftGMGA 3660 

Vi-IATTATQA ACCGCAATAT GGGAGGAACA AAGACAMX31 CTGGGATCAG CTGGTGTCAG 3720 

TCOCTACTTA GGAAATACTC AGCAACIGTT AQCTGGQAAG AATQTATTCG GCRCClTtGCC 3780 

CTGAGQACiCr ITCXG2U3G«3 TAAAAAGACT ACIGGCCICT GTGCCRTGGA TGATTCTTTT 3840 

GCCATCACCA 6AAATGATAG OGTGCA6TAB AGRGCftAAGA TGGCTTCXaST GBSRC ROWG 3900 

ATOQCGCATA GTQTGCTCQG ACACAGfTTTT GTCXTOGTAG GTTGTGATGA TAOCACSGC^r 3960 

TTGTTTCTCA. AGOGCTATOC ACAGAAOCTT TGTCftACTTC AGTTOAAAAG AG6XGGAT;rC 4020 
ATGTCC2VGSkG CtCATTTOGS GGTCAGGIGG OftAAGCC 

Seq ID NO: 396 Protein sequence
Protein Accession #: AAL67965.1



11 



21 



31 



41 
ffSSHIKHHQP
NSVDPBNITE 
NFTIINKLTSI. 
SKMIFIiAXIIiQ 
BBISHTQSSIj 
HCIPFTVXGH 

huaaxBt&nS} 

VGPASVIGND 
KPnTFVOHIK 

PTEIjTQSQMD 

CITQORVtQE 



I 

AMAHIjWGFCH 
IFIAWQKRIiE 

tPNCGLPSRN 
BITNISSDDS 
PKPALQWPW 
SKOISAHFMG 
LSVYAWVIA 
DDSASFLHHX 
RHNIVLKREli 
U^HEEIIVKFy 
BZAQQZAAGH 
LPIRHMPFB8 
PHTCPQEVYB 



i>WGFWRAAF 

SELrLVGNPF 
liAAPHiyrVEE 
GKQlGCVAEir 
OAIIiNESKYI 
HPQIDDGANP 
eWGPCIiIATM 
SNGSNTPSSS 

GVCVBGDPLX 
VYIJ\£QHFVH 
IHYTSKFTTEG 
LMLGCHQSBP 



I 

ACPieCKCSA 
VGUiNIiTIVD 
TCSCDIMMIIC 
OKSITLSCSV 
LVGEDQDSVK 
CTKIHVTNHT 
NYPBVIYJEDY 
TJFhLYLKBHS 
E06PDAVIIG 

KVFEYMKHGD 
RDLtATRWCXiV 
DVWSIXJWliH 
HHRKHXKSIH 



I 

SRIHCSDESP 
SQE.KFVABKA 
TIJQEAKSSPD 
AODFVPMMYW 
LTVHFAPTIT 
SYHQCLQLDfil 
<5TAAJIDIGDT 
KK34ICDFSWP 
MTKIPVIBSP 
DKILVAVKTIr 
UIKFI^RAHiGP 
GEin^VKIGD 
EIFTYGKQPH 
TLLQHLAKAS 



GIVAPPRLBP 
FLKNSNIiQHI 
TQDLYCLNES 
DVGWI*VSKHM 



PTHMNNGDfYT 
TNRSHEXPST 
GFGKVK&KjQG 
QYFGITWSQIi 

DAVLMABGNP 
P(3MSHDVYST 
YQLBNNEVIB 



Seq ID NO: 397 DNA sequence
Nucleic Acid Accession #: AB052906
Coding sequence: 74..814



1 
I 

AAAACCTTQA 
CTCTGGGTCC 
GCTCCTC3CTG 
CMCaCOCSTC 
GOWrOAAAAG 
CCTGGGGAW? 
OBTQGrrGGAC 
CGAAGGCCTC 
TtSGATCT-rOG 
AATGT06ACA 



CTTCTTGATG 
CTCACSGCACA 
CATCCTCCCC 
AAQCTGATAC 
CCMSCTGOOC 
TQQACGCAAT 
TAOCTAACAT 
TTCTG<3CTGIA 
GTACITCTTT 
TZUiAClTCAG 
ATAAGAAAAA 
TTXRAATAAA 



11 

! 

QQTGATTCaT 
TTAATGGCAO 

TCCoacroGT 

ATCCCTAAGT 
ACTTTTCTTC 
AAACTAAATS 
ATACTTACA6 
ACCCTGCAGG 
CAGTTCAGTT 
AGGGTTCATC 
ATGTCCTTCC 
G6CATQC3ACA 
ACCCAACTCA 
TGCTXXZATCC 
CAAAAGGCTC 
AOGACCTAOS 
AGCTCATTCA 
ATTATI3C3UCT 
CTAAACAAGA 
GAATOATQAT 
AGCTCTGGG6 
ATTTATATTA 
GAOTTCTATT 



31 

I 

CTTCCAGGCT 
CAGGOQCOGC 
OCOGGGCTGG 
TCAjOACCTGG 
ACTATQACXa 
TCS^ChACGGC 
AGCAACTG06 
CCAGGATGTC 
TCOATGGGCA 
CTGGASCCAC3 
ATTAUTl'L^L'TC 
GCADOCTGGA 
GGGC3CAC5VSC 
TCCCTGGCAT 

ctotqacx:ac 

GTGTATGTOC 
CTGCXTTTGAT 
TTTCTCTTGG 
TATATC31TTT 

ATTCTTTOGG 
ATGATTGTTT 
TCCCRRAAAA 



31 
I 

CTCCTTCCAT 
TAOCAAGATC 
QCGRQCCGAC 
ACCAa?GTgC3 
TGGCAACAAG 
CTGGAAAGCA 
VGACATTCAG 
TTGTGAGCSU3 
GATCTTCCTC 
AAAGATGAAA 
AATGGGAQAC 
GCC&AGTGCA 
CACCACXXTC 
CrGAQGAGAG 
GGTCTTGATC 
AOTGC3CCTGC 
TCCTTTTGCC 
TaCTAdCTOA 
TCTTTCTTCT 
CAAATQATAU? 
TGTCCTGAAA 
CCTTTAGTAA 
AAAAAAAAAA 



41 
I 

CAAGTCTCTC 
CTTCTGTGCC 
CCTCftCTCTC 
TGTGOGGTTC 
ACASTCACAC 
CIVGAACXXW3 
CTGGAGAATT 
AAAGCT6AA6 
CTCTTTOACr 
GAAAAGTGGG 
TGTATAQOAT 
GGA6CAOCAC 
ATOCTTTOCr 
TCCTTTAGA3 
AAACTCQCCC 
AGCAGATCAT 
AAOUITTTTA 
TGGAATTCCT 

OrGTCAGTAAA 
6AGAATTT3T 
TTXATTGTTC 
AA 



51 
I 

CTCCCTAGCG 
TCCGGCTTCT 
TTTGCTATGA 
AAJGGCCRGGT 
CXGTCAGTCC 
TACTOAGAOA 
ACACRCCCAA 
GACACAGOU3 
CAGAGAftGAG 
AOAAXGACAA 
GGCTTGAGGA 
TCGCCATGTC 
GCCXCCTCAT 
TGAC3U3GTTA 
TTCTGTCTGG 
GATQACATCA 
CCAGCAGTTA 
GCACTTAAAQ 
GGAAAATCAA 
ATAATCAOXr 
AAA*XimTTA 
T6TACTOAT3L 



Seq ID NO: 398 Protein sequence
Protein Accession #: BABi5104a.1

1 11 21 31 41 51

| | | | | |

MAAAAATKIL LdiPIiLUiLiS GfifSRAGSlADP HSLCYHITVI PKFRPGPRWC AVQGOVDBKT 
FLH!flXX3NKT VTPVSPLGKK tNVTTAWKAQ NPVLKEWI>I IiTEQUiDlQl* EfWTPKEPLT 
J^QABMSCBQIC AE(3E[fifiGSWQ FSFDGQIFXJi t'D<iHKRMWTT VHPGARKMKS KNEtaDKSTVAH 
SSHVPSMGDC ZGniiBPFXil«3 MDSTLBPfiAS APLAMSSOTT QLRATATU^LI LOCLLIIIiPC 
FZLV8Z 

Seq ID NO: 399 DNA sequence

Nucleic Acid Accession #: NM_001898.1

Coding sequence: 57..482



G6CTCTCACC 
CCCA£3TATCT 
GGCCCAAGGA 
AOTGGGTACA 
ACTACTACAG 
ATTACTTCTT 
A£3AiOCXGTGC 
TCTACGAAGT 
A8GGATCTGT 

ocaccccTGG 

GACAGACAGA 
CTTCCTTCTT 
AAACAGTA6C 



11 
I 

CTOCTCTCCT 
0AQTACOCTG 
GGAGGATAGO 
•□CQTGCCCTT 
ACGTOOGCTG 
06A0GTABAG 
CrrCCATGAA 
TCX^CTOQGIAG 
GCCAGGCCAT 
ACTGOTBGCC 
GAAGGCT6CA 
GCTTCTAATA 
ATCGOC 



21 
I 

OCAGCTTOCAG 
CTGCTCCTGC 
ATAATCCCGG 
CACTTCGCCA 
GQGQXACTAA 
GTGGGOOGCA 
CW30CAGAAC 
AACAGAAGGT 
TCGCRCCAGC 
CCCAOCXTTGC 
GGAGTCCTTT 
GCCCTOGTAC 



31 

1 

CmGTGCTC 
T(3GCC2waOCT 
GTGGCATCXA 
TCAGCGAGTA 
GAGOCAGGCA 
CCATA7GTAC 
TGOVGAAGAA 
CCCTGGTQAA 
CACCftQXAC 
GG6AG60CTC 
QTTGCTCAGC 
ATGGTACACA 



41 

1 

TOCSCTCTGftG 
AGCI6T0GCC 
TAA£3Ga^AC 
TAACAAOGOC 
ACAQACCGTT 
CAAG^CGCAG 
ACAGTTGTGC 
ATCCAGGIGT 
TOCCftCOCCC 
C0C3VTGTGCC 
AOGQCGCrCT 
CCCOOCCACC 



Seq ID NO: 400 Protein sequence
Protein Accession #: NP_001889.1



€0 
X20 
180 
240 
300 
360 
420 

4ao 

540 
600 
€50 
720 
7B0 



60 
X2Q 
160 
240 
300 
360 
42 0 
480 
540 
600 
660 
720 
780 
840 
900 
d60 
1020 
1080 
1140 
1200 
1260 
1320 



60 
120 
180 
240 



51 
I 

GAGACCATGG 
CTGGOCTGGA 
CTCAATGWTG 
ACCAAAOATG 
OGGOOGGTGA 
OOCAACTTGG 
TCTllTGGAfiA 



TGTASXGCTC 
TGCGCCAA0A 
GCCCTCCCTG 
TCCTOCAATT 



6D 
120 
180 
240 
300 
360 
420 
4B0 
540 
600 
660 
720 



1 11 21 31 41 51

| | | | | |

MAQYLSTUiL UjATLAVALA flSPKBSDRZJ PaGXYHADLN X>8HVQRAIiHF A1.BEYKKKSK 
DDYYRRPLRV lAABQQTVGG VmTFFDVEVG RTICrKSQBN IiDTCAFHSQP EXiQKKOLCSF 



60 
120 






BIYEVPtlENR Rfi£»VKSRCQB 8 



Seq ID NO: 401 DNA sequence

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299..961
1
I 

CTCTGAJQCTT 
CATGGAGTTG 
CTACTTCTQC 
GGGTGGCAG6 

TGCCCTQTGG 
GOGCTCCGGG 
CGCC30QCC3U:: 
GCXX3CCX3C3i6 
OGGGGGCDQC 
GOOCTTGCGGG 
OGACGftGCTO 

GGCCGTCM3C 

AGGGCTCOCT 
CCTCCCX3C&G 

AiaaccccTAC 

OGAGOlXTTTC 

cccTCCTcra 

CCTGTACTCA 



11 
1 

CTCTGASCCT 
TGAAAGAATA 
TGGGTTQRGr 
COSGTCOCCC 

CTTGGncaGCC 
CCCRCCCPGG 
CCCCGCAGCC 
CTGCGGGOGG 



CTGCGCTOGC 
QTGOSTTTCC 
CTGGOCAGCC: 
CAGCCCTGCT 
TGGft(3AACCB 
OCAGGGCTTT 
AGTCCCACTA 
Q06TGG6TGA 
GCGCTOIOCC 
GGAOCCACTT 
ATGAACAXrrA 
AAGGACACAT 
CTCA1GGGAG 



21 
I 

TGTTTGCTCa. 
CSCTOCAAAGC 
CTAGCTGTOT 
ACAAAAGATA 
TCAACAATGG 
TCTCCRCGCI 
CCGCTCTGOC 
CTGCXX:CCCG 



AGCTOOTGCX: 
GCTTCTGCAG 
TACTGOGCOC 
GCOGAOCXZAC 
TOGACCXSCCT 
GC3VGACT0QA 
GCCAGCGGCC 
TGGATATC3W 
TGCGGATOCC 
CTCAC3U3}\CC 
CIU37GGCTQA 
ATTQCAGTTG 
CTGGCCCC 



31 
I 

TCTGGAAftAA. 
ACCTAACACA 
AGGCCCCTT6 
ACTCATCTCT 
CTGATGOGCG 
6T0CCACTGC 
TCTGCTGAGC 
CGAAGGCCCC 
CX33CTGGTGG 

OGGCAGCGGC 
GGTG06OGCX5 

OGGGGCCCT0 



41 
I 

6GGQATTAAA 
TAGTAAGGTT 
TTCCTCACXrr 
TAATTTGCAA 
CTCCTGGTOT 
CGCTGGCCTA 
AGCGTCGCAG 
CCGCCXOICC 
AGTGGAAOAG 
CCCXXATCTG 
OCTOG6GCA9 
CT03aCCTGG 
CGCGGOSOGC 
COACCGCCCC 



CTUOGCCACC 
CCCTTACCGG 
TCAGCCAGGG 
CCCCGAACAG 
ASCCTAAAAG 
CTGGGACTGG 
(3QCA.<XCA!GCC 
CTTGGTTQAA 



GCCTGOGGGT 

TGSCTcrrcc 

AGGAAGGOCT 
OTOAAOGGAC 
AC%GCA0A0ft 
GCAGGCCXGG 
CCOGCCCAGQ 
AOIGOCTGTG 



51 
I 

CCATTTACCT 
CCCAGTGCAG 
GGAGAAACTG 
GCTGCCTCAA 
TOATAGAGAO: 
GGOGGCAGCC 
AGQCCTCCCT 
TGGCGTCCCC 
CCCGQOGGCC 

osGGGGoeae 

GCCACCSCTC 
GCTCrCCACA 
GGGGCTOCCG 
TCATGGAOGT 
GOCTGGGCTG 
TGCCTGQGAC 

CAAAec7r<3Aa 
AACIGftCTAG 

AACCTGGGAC 
CCCTGTAGOS 
CIGGAAGTOG 



60 

12 D 

lao 

240 
300 
360 
430 
4B0 
540 
600 
660 
720 
780 
840 
900 
960 
1020 

loao 

1140 

laoo 

1260 
1330 

13 BO 



Seq ID NO: 402 Protein sequence
Protein Accession #: NP_003967.1

1 11 21 31 41 51

| | | | | |

HBDGLGGLST LBHCFIfFBRQ FALHFTIAAL ALLSSVAEAS LGSAfill&PAP RBGFPFVLAS 60 

PAGRIiFGGRT AaHCSQRAUK PPPQP&RPAP PPPAPPSAIiP SGGSAARAOG FGSKHSAAGA 120 

SGCR1J16QLV PVSALGLGSR SDBLVHFKFC GGSCRRKRSP BDIi&LASLU3 AGAIiRPPPGS 180 
RPVSQPOCRP TRYSKVSEVSD VHSTttRTVDR ItSATACGCLG 

Seq ID NO: 403 DNA sequence
Nucleic Acid Accession #: NM_057091.1
Coding sequence: 783..1445

1 11 21 31 41 51

| | | | | |

ACTGGOCOCT Q3UaAaiUU3?LA ITOGGGTGGAG CASAjSAGCAG CTGCTQCAGa GCAOACAGCC 60 

GCSMlCCCCfA ATCTGCACX3T ACCAQCSAfiTC AGOOGCCCCA OGCAfiOGAOC GGCTTACCCC 120 

TCGCTCCCOa CCCTCACrCA CTTTCTOCOG COCTCGGCOC GGOCTCCCAO CTCTCTACTX 180 

CGCaSTGTCTA CAAACVCAAC TCGOSSTTTC OGTGOCTCTC CACOGCIOGA GTTCTCTACT 240 

CTCCBWaVTCG GAGGGGOOCC TCOCMCATC TACGOCCCTC OCAACCTCQG GGGACCTAGC 300 

CAAGCTAGGO GGGACrrGGAT OCX3ACGGGTG GAGCAQOCAG G<£GAGCOC0G AAAGQTGOGG 360 

CGGGGCAQGG GCGClCCCAG COCCAOCCOS GGRTCTQGTG AOGCTGGGGC TQGRATTTGA 420 

CACGGGA0G6 CTGOGGCGGC QGGCAGGAGG CTGCTGAGGB ATGGftGTTGG GCCCGGCOOC 480 

CM3ACUUaaC COGGGGGCTC CGCXSWGCAaC AQGTOOCTGS QOOCCCUGCC CTOGCXOOCA 540 

COCGGGCSCTG OAGCOOCACA OCGGAGGGTG CAQACrGOCr GOCAA0GCCA CACTTTTGGC 600 

rrAAAAjQAGGC ACTGCCAOGT GTftCAGTCCI GGGCATG08C TOTTTGAGCC TOGGGG<SVSA 660 

GCCCAGCACT GQTCCCOSGA AAGGTGCCXA GAAfSAACAAG GTGCRGGACC CX3BTGcrGCC 720 

TCAACftOGAG GGTGGGGCJAA CAJ3CXCAACA ATGGCTGATQ GGCGCTCCTG GTGTTGATAG 780 

AGATGQAACT TGGACTTGGA GGGCTCTOC&. CQCTGTOOCA CTGCCCCTOS CCTTAGGCGGC 840 

AGGCieOOCT GTGQCCCACC CIGGCCGCTC TGQCTCTGCI GAGCAGOGTC GCACVUSGCCT 900 

CCCTGOaCTC CGGGCGCC3SC AGCCCSOCCC CCC3GCGAA60 CCCOCCGCXTT GTCCTGGC3GT 960 

CCCOOGCOGG CXACCTQCCG GQGGGACGCA CGGCCCaCTG GXGCAGTGGA AGftGCCCQGC 1020 

QQCCeCJCGCC GCAGOCTTCT C0QCC03CGC CCCGQC06CC TGCACCXXCA TCTGCTCTTC 1090 

CCC6GGGG0O CCGOSOGGOO GGGOCIGGGG GGOOGGGCAG CC6GGCTOGG 6CAGCGGGGG 1140 

GGOGGGGCXG COG OC TGQGC TOGCAGCTGG TGCCGGO^GOG GG06CTOGGC CTGQGOCACC 1200 

GCIOOGAOGA OCTGOTGCGT TTCOGCTTCT GCftGOSGCTC CTOCC3GCCX3C GOGCX3CTCTC 1260 

CACACQACCT GAGCCTGGCC AGCCTACTroQ GCQCCXSGGGG OCTGOGACCG CCECCxaSGCW 1320 

CCOGGCOCGT CAOCCMaCCX: TGCTQCCGRC OCACGOGCTA CGAAGCSGGTC TCCTTCATGG 1380 

AOGTCS^ACAG CACdGGAfiA ACaGTGGAOC GCCTCXCOGC CS^OOGCCIGC GOClGCCrOS 1440 

GCTGAGOGCrr OGCTCCAGGG CTTTGCAGaC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500 

GGACCCTCCC GCAGAGlCiCC ACI!2WGCCAGC GOCCTCAGCG AGGGAOGAAG .GCX:TCftAAGC 1S60 

TGAGAGQCCG CTACOGGTGG GTGATGGAXA TCATCCC3CGA AEZAGGTGAAG GGAGAACTGA 1620 

CTAGCAGOCC CAGAGCCCTC ACCGTGCGGA TCCCAGCCTA AAAGACACCA OAGACCICAG 1680 

CTATGGAOCC CTtTGCaQACXZC ACTTCICACA GACTCTOaCA CTGGOCAGGC CTOGAAOCTG 1740 

GGACCOCTCC TCTGATGAAC ACTACAOTQQ CTGAGGCTTC AGCCCCXX3CC CAQGCCCTGT IBOO 

ASQGAC9U3CA TTTGAAGGAC ACATATTGCA GTTGCTTG6T TGAAA6TGCC TGTGCTGGAA 1860 
CI06CCTGIA CXCACTCKTG GGAGCTGaGG CC 



Seq ID NO: 404 Protein sequence
Protein Accession #: NP_003967.1






1 11 21 31 41 51 

| | | | | |

MELGIiGGLST I»SHCPWPRHQ PALWPTIAAL ALljSfiVAEAS 1*GSAPRSPAP REOPPFVLAfi 60
PAQHLPGGRT ARWC5GRASR FPPQPSRPAF PPPAEFSAL? RQGRAA3UU16 PGSRARAAGA 120

EGCRLHSQliV PVRALGLGKR SDEliVRPRFC SGSCRRARSP HZHiSIASLLS AI?ALKPPFG£ 180
RPVSQPCCRP TRYEAVSPMD VMSTWRTVDR LSATA06CLG 






Seq ID NO: 405 DNA sequence

Nucleic Acid Accession #: NM_057160.1

Coding sequence: 1..714



1 11 21 31 41 51

| | | | | |

ATGOCCGGCC TQATCTCAiSC CC3GAGGACAG CCCCTCCTT6 AGGTCCTTCC TCGCCAAGOC 60 

CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGOTC TCTGOQGGCA GCCTGCCCTQ 12 Q 

TGGGCCACCC TOQCXaSCXCT GGCTCTGCTO AQC3VGC2SrCG CAGAC(30CTC CCTGGGGTCC IBO 

GCOCdtlCOtA GCaCfSGOOCC CCGCGAAGGC CCCCCGCCTQ TCCTGQC3CSTC CCCCOOCOGC 240 

CACCTGCOGS GGGGACGCAC GGCCCGCTOG TGCAOTGGAA GAGCCCGGOQ QOoaCCGCCG 300

CAGCCTTCTC GGCCCGCGOC CCCX3CCGCS:T GCRCCCXXAT CTGCTCTTCC COGCGGGGGC 360 

0G0QOGGCX3C QCSQCIGQGGG CCCGGGCAGC OQCGCTOSQG CAGOGGCSGOC GCOGGGCTGC 420 

CGCCTGGGCT OGCAGCTGQT iiCCQaTGCGC GCGCTCGI3CC TGGG0CAC06 CTCOGACOAO 460 

CTGGIXSOQTT TCCMCTTCTG CAG03GCTCC TGCCGCGGCG OGOQCTCTCC ACAOGACCTC 540 

AGCCTGGOCA GCCTACTGGa CGCCXSGGGCC CIGCQACCS^ CCGCOGGCTC OOOGCCCSSTC 600

A0CC3U3CCCr GCTGCCGACC CACGOTCTAC GAAGOGGTCT CCTTCATGQA CGTCAACAGC 6G0 

AOCTGGAGAA COGTGGACCQ CCtCTCOGOC AtXHCCTGCG QCTGOCTGGG CTGAGOGCTC 720 

OCICCAiQSGC llTXGCItfSACT GGACCCTTAC CGGTGGCTCT TCCTGCXinGG GACCCTCCOQ 7B0 

CA6AGIC0CA CTAlSCCAlGCG GCCTCAGCCA GQC^ftCGAAGG CCTCAAAiOCT OAGAGGCCCC 840 

TACCtjGTOGG TGATGGATAT CATIXCOOAA CAGGTOAAOG GA£!AACTGAC TAGCAGOCX?C 900

AjSA(3CCCrCA CCCTQOQQAT CCCAGCCTAA AJU3ACACCI\G AGAGCTCAGC TATGOAGCCC 960 

TTGOSACCCA CTTCTCACTuS ACTCTQGCIAC TQ6CCAGGCC TCOAACCXGG GACOOCTCCT 1020 

CTGATGftACA CTACAjSTGGC TGAGGCATCA QCCCCCJGCCC AGGCCCTGTA aaOACRGCAT lOBO 

TTGAAGGACS^ CATATT6CAG TTQCTTtSGTT GAAAGTGCCT OTGCTOQAAC TGCCCTfTTAC 1140 

TCACTCATGG GASCrGGCCC C



Seq ID NO: 406 Protein sequence
Protein Accession #: NP_476501.1

1 11 21 31 41 51

| | | | | |

MPQEiISASiGQ PLLEVIiPFQA HliQAIiPLiP^ PLGLSAQPAL HPTIAAliAZJa 8BVABA3U39 60 
AE^aSPASfiEG PPFVIASPAQ HLFGORTARW CSGRARKPPP QP&RPAPPPP APPSAiyPSGQ 120 
BAARACOPQS BARAAGARSC SLR8Q&VPVR AliGLGHRSIXE LVRFR^CSBS CBRASSPHDL 180 
GliASLljGnSA liRPPPOSRPV SOPCCRPTfiY EAVSFHDVSS TWRTVDRLSA TAOGCIiQ






Seq ID NO: 407 DNA sequence

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29..715



1 11 21 31 41 51

| | | | | |

CTGaiTOC3QC3g CTCCTGGO^r TGATAGW3AT OOAACrrGGA CTTGOAGGCC TCTCCAOGCT 60 

GTCCGACTGC OCCTC3GCCTA GGOSlSCAGGC TOCACTTSGT CTCfCCGOGC AjSCCXGCCCT 120 

GTGG0CX:ACC crooCOaClC 'rOGCTCTGCr (3AGC:aGCGTC GCM3A0OCCr OCCTeQGCTC 180

CGOeOCCOGC AGOOCTGCCC CCCGCJOAAGG CCCXXXSGCCT GTOCIGX305T OQOOOOCXXJa 240 

CCaCCTGCOS CK3GGGACGGA CGGCCCGCTO GTQCAGTGGA AGAOOCXXMC QGOCGCCGCC 300 

GCAGCCITCT OGGCOGOOGC CCSMGCOGCC TGCACCCXXIA TCIGCTCTTC CCOOGOGGGG 360 

ccwaacaaQCG cgggctgggg GcooeoacAG cogcgctogg acsiQcoGOOG cggggggcto 420 

CCGOCIGCGC TCQCAGCTOa TGGOOGTOCC? 09CQCTGGGC CTGGGCCACC C3CTCCmC(3A 480

GCTGG1X30GT TTOCGCTTCT GCAQCQCSCrC CIGC06GGQC OCQCSCXCTC CACACSACCT 540 

CAGCCTGGCC AGCCTACK3G GCGCOGGGGC CCI6GGAOG6 OOCOOQGSCI CGOGGCCOGT 600 

CAGCCAGOCC TGCTGCCQAC GCAC60GGTA GGA2U3CGGTC TOCTTCATCG ACOTTCSIACAG 660 

CAC3CrQl3AGA 2iDCGTGGACC GCCrCT<3?QC (ZMXXSOCTOC QtJCTeCCTGG GCTGAGGGCT 720 

CGCTCCAGGCI CTTTGCAQRC TGGACOCTTA OOGOTOGCTC TTCCTGCCra GSACCCTCCC 780

OCROaerCCG ACTAGOCAGC OGCCTCAGCC ACXSGaVOSAftiS GOCTCAAAGC TGAOAOSGGC 840 

CTACCGGTGG GTGAT(3(?ATA TCATOCCGGA. ACAi6(3T(3A»e» GQACAACXOA CTAfiCAjQCCSC 900 

CAGAGOCCSTC AOCCTGCXSGA TCCCAGCCTA AAAfiACACCA GA0ACCTC2U3 CXATGGAGCC 960 

CTTCXSGRiOCX: ACTTCTCACA GRCTCrG(3CA CTQGCCAQGC CTCXSAACCTG GQACCCCTCG 1020 

TCTGATGAAC ACXACAGTOS CTGAGSCATC AOCCGCCGGC CA8Q0QCTGT AGGGACAgCA 1080

irrTGAAGGAC GXTQCTTQGT TGAAAGTGCC TGIGCTGQAA CTGGOCIGTA 1140 
CTCACTCATO OQAfiCTGQCX: CC 



Seq ID NO: 408 Protein sequence
Protein Accession #: NP_476431.1

1 11 21 31 41 51

| | | | | |

MELGLOaLST LSHCPWPRRQ APLGLSAQPA iMPTLAAIiM* l^SSVABASLG SAPRSPAPRE 60

GPPFVLASPA GHEiPGGRTAR WCSGBARRFP PQPSRPAPPP PAPPfiAJiPBfi BSAARAG6P6 120

SBARAAHARG CRIiRJSQIiVPV RAIGLCORfiD BLVR7RFCSQ 8CRRARSEHD laSIASUiGAG 180
ALRPPPGSStS VSQPCCRPTR IfEAVGFMDVIf 9THRTVJ3BL3 ATACOCUS 






Seq ID NO: 409 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1746

1 11 21 31 41 51

| | | | | |

ATGCCACTGA AGCATTATCT OCTTTTGCTG GTGGGCTGCC AftGCX:rQC5GC3 TGCIU3aaTTG 60 
GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCXiRGGG CCTCCCAGGT GGACjreCACC 120 
GGOGCACGCA TTGTGGCGGT GCCCRCCCCT CTGCCCTGGA ACGCCATGAG CCtGCAGATC IBO 

crcAACAaac acatcactga actcaatgag tccccgttcc tcsatatctc AGCOCTCATC 240

GCXOTOAGGA TTGAGAAGAA TGAGCTGTCB OGCATCAOGC CIGC3GGCCTT OCGRARCCTO ZOO 
GGCTC3GCTGC CCTATCTCAG CCTCOCC&AC AACAAGCTrGC AG6TTCIGCC CATGGOCCTC 360 
TTCC3U3GGCC TGGACAGCCT TGAGTCTCTC CTTCTGTOCA OTAACCAGCT GTTGCAGATC 420 
CRGOCGGOOC ACTTCTCCCA GTGCAQCAAC CTCAAQOAGC TGCAaiXGCA CGGCAACCAC 480 
CTOGAATAiZA TCCCTGACXSG AGGCTTC6AC CACKTGGTAG OACTCACGAA OCICSiATCTG 540
GGCARGAATA GCCTCACCCA CATCTCftCCC AOQGTCTTCC AGCACCTGGG ChATCTCC3U6 600 
GTCCTCCGGC TGTAT6W3AA CAGGCTCACG OATATCCCCA TGGGCACTTT TOATGGSCIT 660 
GTTAACCTGC AGOAACrGGC TCTACAC3CAG AACCAQATTG GACTGCTCTC OCCTGOTCTC 720 
TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAAOCACAT CTCCCACSCTG 780 
CCACCCAGCR TCTTCATGCA GCTGCC5CCAG CTCAACCQTC TTACTCTCTT TE^GGAATTCC 840
CTGAAGQAfSC TCTCTCTGGQ GATCTTOGGG CCCATOCCCR AOCTGCOGGA GCTTTSGCTC 900 
TAKaCAACC ACSmZTCTTC TCTACOCSGAC AATOTCrTTCA GCAACCTGCB CC3U5TTOCAG 960 
GTCCTOATTC TTAGCCGCAA TCJVShTQMSC TTCATCTCCC OaoGTGOCTT CAAGGeGCTA 1020 
ACGGAGCTTC GOOAGCTGTC CCTCCACAOC AAOGCACTGC ACGACCTQGA OGGGAATGTC 1080 
TTCCOCAT6T TOGOCMCCT GCAGAACATC ITOCCTGCAGA ACAAT0GCX:T CAGACZUaCTC 1140
CCAGOGAATA TCTTCGCCAA CBTCAATOGC CTCATOQCCA TCCABCTGCA GAACAAOCA6 1200 
CTGGAGAACT TGC3CCCTOGG CATCTTOOAT CACCnOSCSQA AhCTGTGTOA GCTGOGGCTO 1260 
TATCACAATC CCTGGAG^TQ TQACTCAGAC ATCCTTCOOC TCCGC3ACTG GCTCCXSCTC X320 
AACCAGCCTA GGTTAGQGAC GGACACTGTA OCTGTGTQrr TCAGCCCAGC CAATGTCCQA 1380 
QGOCAGTCCC TCATTATCAT G&ATGTCaAC GTIGCTGTTC CAftGOGTCCA T0TCCCTGAG 1440
GTGCCTAGTT ACCCIAGAAAC ACCATtSGXAC CCAGSlCACAC COVSITAlCCC TGACACCAiCA 1500 
TOaarCTCTT CTAOCACTQA GCTAACCAGC OCTGTGCSUUS ACiAChCTGA TCTGACTAGC 1560 
ATTCAGGTCA CTGATGACCG CACSCQTTTGG GGCATOACCC AGGCCCAGAG OGGGCTGGCC 1620 
ATTGCOaCCA TTSTAATTQO CATTGTCBCC tnGGCCTGCT CCCMGCTGC CTGCGlOGGC 1630 
TGrcGCTQCT OCAASAAGAG GAGCCAAGCT GTOCTQATGC AGATGAAGGC ACCCAATGAQ 1740
<IGTTAAA<3AG GCAGGCTOQA GCAGGGCTGS GSAKTGATGG GACTGGAGGA GCTGGGAATT 1800 
TCATCTTTCT GCCTCCACCC CTGOGTCCAT GGAGCTTTCX: OCTGATTQCT CTTTCTGGCC 1860 
CTAGATAAAG GTGTGCCTAC CTCTTCCTQA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920 
CGXGCGGGAC CTTCCTACAA TCAiaaAAGAT AGAXCCAACT GGCCATGQCA AAAGOOCTGQ 19B0 
GGATTTCCXaA TTCRTACCCC TGGGCITCCT TOGAGRGGGC TCETCCTCCA AATCCTOCOC 2040
ACCIGTOCTC CAABAACAGC CTTCCCSGOG COCftGaCiOCC CTGOQGOCCIT CTGTAGRCTC 2100 
AGTTAGTCCA CftGOCTGCTC ACTTOGIXKSG AAiaSTTCTC caCTGAGATA GCCC3OTCT03 2160 
CCTAftflGTATT ATDTAAGTTG ATTTCOCTTG TTTTGTTTCT CTTSTTTQTG CXATGGCTTG 2220 
ACGCA)GCATG rrCOCCTCAAA TGAARGTTCT CCCCTTGATT TTCTGCTGCF GAAGGCAOOG 2280 
TOAOTTCrrCT CCTCAAAGAA GACTTOUULC CATTTAACTO GITTCTTAAG AjQCOGTCRAT 2340
CAGCCTG3TT TTGGGGATGC TASGAAAGAlB AGAAGGhAAA. TC»IdGC3BCT CAGTTCCTGG 2400 
AGACAGAAGA GCSOgTCATCA GTGTCTC3VCX T6IGATTTTT ATCTOGAAAA GGAAGAAACA 2460 
OCOCA0CACA GGAAGCTCAO CCTTTlTiffliO AAflOftTATTT CCAAACTGCA AACTTTGCTT 2520 
TQAAAAGTTT AGCCCnXAA GGAATQAAAT CATOXAOAAT TTTGGACTTC TAAZiAACATT 2560 
AAAATC3WaCT TATTAATACXJ GGATAfiAGAA AflAA&TCTGG TfiCClGGGGG TCOCTGTCTT 2640
CACXWCTAOA GTTTGTTTTA AAATTTTIAA TTGAAOCSKTC TGAaOO^aZAC S3GC3U3AAAA 2700 
GTGGOAACAT GATAGTGTAT GGCITGGTOG ATTTTCACAA ACXGAACATA CCTGTGTAAT 2760 
CAQCATCTAG AC3CCAGACCC AGAGCATTCAC AAATATCCCC C3VTC2CrGGGC TTTTOCXaftGA 2820 
GQAGATGGGG GCTTCTQATiG ATGGACTTAC CUGGGACCTG CXXXZCCATGA GCCftGGT^GGG 2880 
TCOCCCCACA GTC2U3CCTGT GCAAAOGCOC OOTGOGGAGG GGTGOAGG&G AATATGTGGG 2940
TGTGGACAGG ATGGGiUSACT G3GGGCTQAA CAGGAGATIT TATTATATCI GGl^GACCCTG 3000 
AGAGAOOCTG AGACCTGGGG CACCATGGCT GGCCA8GTGA GAAQCATCCT GACTGCABAG 3060 
GTCC3STGCAG CCACACCCTTC TTOCCTGCCA GGARGTTGTG TGOGGCTCAT CGOftGCOCCC 3120 
TCOGCCTOQA QCCTTCTATG GACSGIGATAT GCClGTATCT aTTTTTAATT TTCAXTCTTG 3180
ACXTAGG6GA AQTGAAATCX3 CTCAGAGA1X3 AGATCCTITA ATTGAAAACQ AAGIGTAACB 3240
GAATCEAGTG TCTTTCSAAT GTGGIAAAAT TClGCKTCAA CATC!ACaGTC AaCTOSCAGC 3300 
TGAACTTCAG AATCXCAGTT ACAGCnSGOG ACAOGGGGGT ACACCOATGG GTCftCACTQB 3360 
GTCTGQGGGC TCCCTOGAGC TCCTCCIOCXS TGTGGTCTaG ^nAGGAGTTG AGTTGTTTGC 3420 
TGCAGGGTTA TTCTCCTCCT CGAGirCACAS TCACAOGAAT ACCTGCCOrTC TCTQQCTTTC 3480 
CXGCTATACA CATATTCACA TG6GSCTCAA GAAdTTAGGC TCATGGCAAC GT6TGICITT 3540
CICTOGACAA CtOGCCCfUSX TTACAGTGAA AlGGAGAKXT TCSUGGTCTOC AGGTCTGOOC 3600 
AGGAAAfflAC TTCAlSCTGAC TOCAOGQGQA TCTQQAAATC CAOGACCAAT CC0GA3CGGC 3660 
TCTTATTA0C TGOCOGCTCC ACRAGACAOC TGTGCTTTGB AAATCXaCCA CCAATCCCSGA 3720 
TCGGCXCTTA TTAGCTCCCC GCTOCACAAG ACACCTGTGA TCTGGAAATC TAGCACCAAT 3780 
CCOGATOGGC TCSTTATTAGC T0CCC3GCTCC ACAAGACAOC TGT9ACAT0C TO%GQGCC}V 3840
OIGQAOCACG TGCIGACCAG TTTTCGCTTC CAGTTCCTGC AlCAAAAMTQ TCCAGAGGGC 3900 
TGmJGCAAA CACTAOTGCav CTTTGTAGCT TTTCACCCTC TGTGCCAGGG AA TCTAG Gag 3960 
AGATQAGGOC CQTCAGA6TC AAGSUSATGTC ATCCOCCXXlftG QGTCTCCAA0 GCATTTCCAC 4020 
ACTATTGGTG GCACCTGQflG OACATGCACC AAGOCTTGOC KSAGCCAACA GQAAiQTOaGC 4080 
OCAGAGGATG GCACATGAGC ATCACCOGCT GATQGTGGCC TGCIGXGOCT GOTGCCAACA 4140
GGGGCRTCCC GOCCCSGTACC OCXCC3kGACA GGAASCAXGG GTTTGOCCAC AGACGTGTrCG 4200 
GOTOCrOCTQ IGAGTGGCCT DCAGATGTCT TTGXGCATAG GC31CAAST0G GCCAGGGCIG 4260 
GAeCSGAGGTG OOAAAOCTCA TCATCCGGTG GGCCX3QCCA ATCTTAACCC AGAACCCTTA 4320 
OOTAITCCTG GCS^OTAOCCA TGACATTGQA GCACCTTCCT CTCCRGCCAG AQGCTGACCT 4380
GAGGGGCACT GICCTCAGAT GACACCAOCC AGGAGCAOCX: TAGSIGAGGG GTCAGGGCCC 4440
CCnWlGTGA ACCTCrEGCC TCXTCCTTTC TCCCATCRGA GTGGTTaOAT GGAGCGKITQ 4500
GOCTCCTTTT CrTCAGCQOG CCCTTCAACC rarCTGCACC ATGTTGTCTG GCTG&GGAGC 4560
llACrAaAAAA QCTGAGTGGA GTCTOCTTTC CAACAGQATG ATGCATTTGC TOATTCTCA 4620 
OGGClGGOWr GAGCOGGCIG GTCCOOCAGA AAOCXGGAOT CSGGGTACAGA GTTCASTTTX 4680 


CCTCTCTGTT 
GTGTTGGACJA 

GGTAGGAGTG 

TTGTCTTGGG 
TTAQTCrrGG 
OGAAAAAATA. 
TGGGCTGTAT 
AACTTTTCAT 
GAACTTCCRA 
AOmaGTCGA 
GOCCOCAGAT 
GGAJUaCSAAlQC 
CTCCTTCCGC 
CTTCATGCTG 
GCCCCAGTGC 
AOOCCTGGTG 
GGT6TACAGA 



TACAGCTCCT 
AGAAACAACA 
CTCTTCAAaC 
COGCCTCTAC 
AGGCTGOQAG 
CTTTCGICAT 
TCATCAGAAC 
AACICTTCCA 
GTATATTGTT 
QGACACAATT 
ACrrC3\0GAAO 
CA6ATQTTAG 
COCACAGTCA 
CATGGCTGTGr 
CCCAQGTTTC 
OCTTCAAAGG 
TTQQCGATGC 
GGCSVQGOITG 
ATCAACAATA 



TGAiGAGTCCC 
AAAGCCAftTT 
ACTGGACGTG 
CCACTTGTGA 
TTTTATTTAT 
TAAACCAAAG 
CTCACTTGGT 
TCCCTTAAAG 
CTTCCTCCTT 
TCCACAACCT 
TTTGCftGAGA 
ATGTATCCTA 
OAACTGAATC 
6TTCAGA0AG 
TTCTTCXCTT 
TA6ATCATGT 
ATTTACAGAT 
GGQGGTCTGT 
AATAATATAC 



AOGCXXZATCT 
AOAACCACTA 
GATTCTCTCT 
TGI3QQTACAG 
CTCTTCftAAC 
GAAATOSAAG 
ACCATATABA 
AATAOAATAG 
AGAATTTAGA 
TTCAGATC3CE 
GCAGACftSCT 
GCITTTAC3CC 
TOCBITGTTG 
GGTGGGCTGG 
AAi3GMSAGAT 

TTCTASGCCC 
CTTCTGCTQC3 

atgtat 



TTTTTAAAAA 
CTAGOCCTCA 
A6GCACTTGC 
yiTGTACAAG 
CCATTCCOCT 
TCAAAAGCTT 
TTTGTCCCTC! 
QATACAj^GAG 
GA'IGTAGftiaC 
AGAOATAACT 
ATAAACCACT 
GGAAGCCftGC 

TGTTCTCACC 
ZAGIUSAATTft 
TCAQOGTTTT 



CTGQCSAGirCA 
GTGCXTACTG 
GCACOCCTGC 
TCTTCTGCAT 
AiSCICATOGC 

gttgctctcc 
tgtaaccaca 
tcatgggaat 
ttctacttag 

TAa:TGG6AAA 
OGGGACCCnS 
CAAACATTCA 
AGTGGCCTTG 
GCGGG6AAAA 
AACCCX3CTGC 
CTGCAAATCA 
GTAGA9TGTG 
TAATOCATTT 



4740
4800
4860
4920
4980
5040
5100
5160
5220
5280
5340
5400
5460
5520
5580
5640
5700
5760



Seq ID NO: 410 Protein sequence
Protein Accession #: BABB45e7-l



1 
I 

MFLKEYliLLL 
liKTHXTEEiNE 
FQGUJSLBSEt 

FSmiBMIiQHXi 

FBKLAHIiQNI 



lAAXVlOXVA 



11 
1 

VGQQAHGAGL 
GPFLNISAIil 

KVFQHLtSNIiO 
YIiSiasffllSOTi 
NVFSHLHQLQ 
SLQNNRIjRQL 

iLPLiunrLi.li 

EDTPSYPDTT 
LACSUiACVG 



21 
I 

AYHGCPSECT 
AliRlBKHELS 
QPAHF&QC8N 
VlfRLYENEliT 
PPSITWQLPO 
VLILSBHQie 
PCaxiFANVKG 
UQPRXjGTDTV 
gvsSTTKLTS 
CCCCKKRSQA 



31 

\ 

CSRASQVBCr 

ritpgafbni. 

XiKBLQLHGHH 

dipmgtbeqi. 
lsirlilfgns 
fispgafer3l 

liMAIQLQMJQ 
PVCPSPAHVR 
FVBDirrDLTT 
VLMQHKAEISE 



41 
I 

GARIVAVPTP 
GSLSyLGLKSI 
LEYIPDQAPD 
VKLQELALQQ 
ItKELSLOlFG 
TELHELSIjHT 
LSHLFLGIFD 
GQGliIIINVIf 

icfvtddrsvw 
c 



51 
1 

liPKMAKSLQl 

hklqviipigl 

HLVGliTKUOL 
NQIGLLSPGIr 
SKPNLRBLHI. 
HALQESILDGKV 
BLQKLCEIiRL 
VAVPGVEVPB 
GMXQAQSQLA 



60 
120 
180 
240 
300 
360 
420 
480
540 



Seq ID NO: 411 DNA sequence
Nucleic Acid Accession #: XM_098151
Coding sequence: 1..447



I 

\ 

ATGATQCATT 
AOTGGGGIAC 
TCTGGAGTGQ 
CXATTTTTAA 
TCTCTAQCCC 
CAG3U3GCACT 
AACL'TTUTAC 
AAGGCATTCC 



11 
I 

TGCTCAATTC 
AGAGTTQ^T 
OAGCTGGGAG 
AAAGTGCTTA 

tcagcrcccc 
toctcttctg 
aagagctcat 

CCTGTTGCTC 




ICavGTGTTGG 
CTGTGCACAG 
TGOGGTAGOA 
CATGC3TGTTC 
tSGCTTGTCTT 
TCCTTAG 



31 41 51 

\ 1 I 

AATGRGroGG CTGGTCCCCC AGAAAGCTGG 
GTTTACMCT OCTTGACAGT CCCACGCCCA 
AGAAGAAACA ACAAAAGCCA ATTAGAACCA 
ATACTCTTCA ASCACXGQAC GIGGATTCTC 
GTGCOGCCTC TACCKACTTS TGATGGGGZA 
AATAGGClGe GftGrTTTATT TATCTCTTCA 
GGGCTTTOGT CATTAAACCA AAOGAAATGS 



Seq ID NO: 412 Protein sequence
Protein Accession #: XP_098151

1 11 21 31 41 51

| | | | | |

HHHIi!ll&QG» KBPAOPPSSK SGVQSSVFZiS VYS6LTVPRP SGV6AG8QCH BBXQIKSQIiBP 
X^FLKSAYCAiQ ILFKHHTHXL SLALSISAVQ VPPItPTCDGV QSBLUPOtVF KKTOVIf ISS 
NFVQpBUtACIt GLSSUIQRKW KPFP0C6C 



Seq ID NO: 413 DNA sequence
Nucleic Acid Accession #: NM_002658.1
Coding sequence: 77..1372



1 
1 

GTOOOOaCAG 
CX:CC6ACCTC 
CAGGGACTCC 
TQGAGGAACA 
GAA ATTOGQ A 
TCACTTTTAC 
CTCTGCCACT 
CCTGGGGAAA 
GCAGGTGGGC 
AAAGCCCTCC 
COGCITTAAG 
CATCTACAGG 
CCCTTGCTGG 
CATOGTCTAC 
GGIGQAAAAC 
CATT8GCTTG 
ACAOACCATC 
CACTGGCITT 



11 
1 

CGCCGTCGCO 
GCCACCAIGA 
AAAOGCAGCA 
TGTGT&TCCA 
GGGCAGCACT 
COAfSGAAAGG 
GTCCTTCAGC 
CATAATTACT 
CTAAAGCOGC 
TCTCCTCCAlS 
ATrATTGGOa 
AGGCACOSGG 
GTGATCAGOG 
CrGGGTOGGT 
CTCATCCTAC 
CIGAAGATCC 
TOCCTGCGCT 
GGAAAASAGA 



21 
I 

CCCTGCTGOC 
GAGCCCTGCrr 
ATGAACTTCA 
ACAAGTACTT 
(3TGAAATAGA 
OCAGCACTGA 
AAACGZACCA 
GCAGGRACCC 
TTQTCCAAGA 
AAGAATTAAA 
QAGAATTCAC 
GQQGCTCTGT 
OCACaWIftCTG 
CAAGGCTTAA 
ACAAGGACIA 
GTTCCAAOGA 
OGATGTATAA 
ATTCIAjQOSA 



31 
\ 

GCAQGCCAOC 
GGCGCGCCTQ 
TCAASTTOCA 
CTCCAACATT 
TAAGTCAAAA 
CAOCATGGQC 
T<3CCCACAGA 
AGACAACCGG 
GTQCATOGTG 



raCCATOCSbS 
CACCTACGTG 
CTTCATTGAT 
CTCCAACAOG 
CAGOSCTGAC 
GGQCAGGTOX 
OGATCOOCnS 
CXATCTCXAT 



41 

1 

GA6GCCI9CCG 
CMTCTCTGGG 

tcgaactgtg 
cactggtgca 

ACCTGCTA1?6 
OGGOOCTGCC 
TCTGATGCfC 
AOGCSAOCCr 
CATQACTGG6 
GGOCAAAAGA 
AAOCBG O CCr 
TGTGGAGGCA 
TACCCAAAOA 
CAAGGGGAGA 

AOQcmecTC 



60 
120 
180 
240 
300 

420 



60
120 



51 
I 

CCGTCXAGCS 
TCCTGGTCGT 
ACrSTCTAAA 
ACTGCCCAAA 
AGGGQAATG3 
TQOCCTGGAA 
TTCAGCTGGG 
GOTGCTATGT 
CAGATOOAAA 
CTCTGAGGOC 
GGTTTGOBGC 
GCCTCAirCAG 
AGGAGOACTA 
TGAAGTTTGA 
AOCTUCAACGA 



TTTGGCACAA GCreiGAGAT 
TGAAAATGAC 



60 
120 
180 
240 
300 
360
420
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080


TGTT6TI3AAG 
CAOCACCAAA. 
CTCAGGGaOA 
CTGQGGOOGT 
CTTACCKrraQ 
AGGGAGGAAA 
TCCATCRaCT 
CACCRCC3«3G 
CRGAOCCTCT 

TGTCirmc 

GGCrOGAAGG 
AATGAJVTAAT 
AATGTGGGAiG 
UTTOCATGAA 
GCTGT6AGTG 
ftAJk.CTGTGTG 
CTGQGGCCTC 

ACCTcarOACC 

ATGOCITOCr 
ACACTQAATA 
A3CAATAAAA 



CTGATTTCOC 
ATGCTRTGPTG 

GGATGTGCCC 
ATCCGCAGTC 
OGGGCACCAC 
GTAAGAAGAG 
GTGAACGACA 
CSOCCaCGATG 
TGGACTGAAQ 

TTCCCAATTA 
CAG066TTT6 
TGTATCAQQA 
TJUUSTGTGAG 
GACTGTQATQ 
TTC3CSGTCCOC 
AGCACTQTCT 
TTTAQCCTAG 
TTTATATTTC 



ACOGGQAGTG 
CTQCTGACCC 
GXTCCCTCCA 
TGARGGACAA 
AjCACCAAGOA 
CCGCTTTCTT 
ACTGGGAAQA 
ATAGCTTTAC 
GRQ6GGTGGT 
CCTGCAGGAQ 
TCCCCCGACC 
GGAAGTGTAA 
GOGAGCAGAG 
AAIATATATG 
TAAGAISCTGG 
CCACACA^ShO 
CACGTGACAG 
CAGTTTCACT 
TTCATCCAAT 
ACTATTTTTA 
CTGA 



XCAdGCAGCCC 
CCAATGGAAA 
RQGCCGCATG 
acCAGQCOTC 
AGAGAATG6C 

TAGGCTCTGC 
CCTCACGGAT 
CCTOACTrCAA. 
TTAAAAAGGG 
GGTOGGCATT 
GCAGGTGAGG 
ACACTAAGCSA 
TGTGTGTATG 
TGTCTOATra 



TGCCTGGGAA 
TTCACATAGA 
CCTCACTGGG 
TTTATATTTT 



CACTACTACG 
ACAGATTtJCT 
ACTTTGACTO 
TACACQAGAG 
CTOGCCCrCT 
AVl-TTTGCAG 
ACAOATGGAT 
AGGCCIGGGT 
CATGTTACTG 
CAGGGCATCr 
TGTGAGGCCC 
txttcttqaqg 
CTTCAGGGCA 
TTTGCACACr 
TTAAGTCXAA 
GGAGIU?GTTA 
IXaTACTTATT 



TGOGGTGAGG 
TGTAATTTTA 



GCTCTGAAST 
GCCAGGGAGA 
GAATTGTGAG 
TCTCACACTT 
GAGGGTCCCX: 
XAGAGTCATC 
TTGCCTGTOG 
GCTGGCTGCX: 
ACCAGCAACT 
OCIGTGCATG 
ATGCvflXahGA 
GAGCTTAGCC 
GGOCTCIGAT 
TGTTGT0TQ3 
ATATTXCCTT 
TAG6TCACTC 
CXSCAGGATG 
ttggccagtt 
ACCaCICCTT 
AATAAAAGTO 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560
1620
1680
1740 

1800

1920 
1980 
2040 
2100 
2160 
2220 
2280 



Seq ID NO: 414 Protein sequence
Protein Accession #: NP_002649.1



1 
I 

MRAXJARUiIi 
HCBIDK51CTC 

GGBETTIEHQ 
R9RLH8NTQG 
PSMYITOPaFG 
CAADPQWKTD 
SHTKSENGliA 



11 
I 

CVIiWSDSKG 
YEGNGHFYRG 
PHCYVQVGLK 
PWPAArTEiSH 
BMKFBVEHLI 
TGCEITOFOK 
SCQGDSGGPL 



21 

I 

SNEUIQVPSK 
KASTDTMGRP 
PLiVQECMVHD 
HGGSVTWOG 
LHKDYSADTZ. 
EHGTDYLYPE 
VCSIiQGlini^ 



31 
I 

CDCIJffGGTCV 
CI.PWHSATVL 
CADGKKPSSP 
GSblSPGNVI 
AHHNDIAX«UK 
QI>XKtWXLI 
T6IVSH0RGC 



41 

I 

OQTHIAHEIBD 
PSBLKFQ060 
9ATHCFZDYP 
IKSlCBGRCAQ 



AlrKDKPGVYT 



51 
I 

CrrCPKKFGGQ 
ALOLGXiGKHlI 
KTLRP&FKII 
KKEPYJVYLG 
PSRTIQTICti 
YGSEVTTKMLi 
RV5HFLPHIR 



60 
120 
180 
240 
300
360 
420 



Seq ID NO: 415 DNA sequence
Nucleic Acid Accession #: NM_024422.1
Coding sequence: 202..2907



1 11 21 

I I 1 

CX30CAAAGGA AAAJSCCOCTT GGATGAGAOG 
OTCTCOGOGC GCOCCAGCTC CTOCXSCCTCB 
GCTCOSGCDQ CQSOCCTOOC CCQBCGGA(3C 
GAJXTGCGOC GAGCCCTCTC CATGQAOOCA 
CTCTQOOGGC IGCTCCTGCT GACCCTOGOS 
AATGTGACAT TACATQTTCC CTCCAAACTA 
CTGAAAORGT GCTTTACAGC TGCAAATCTA 
^TGGAGGATG GTTCAOalCTA TACAACAAAT 
TTXACCA'TAT TACTTTCCAA CACIGMSAAC 
6A6CATCAAA CAAAGtSICCT AAAGAAAASA 
AAISVQAAGAT GGGCTCCAAT TrCCTTGTTCG 
CTTTTOCTTC AAaVGGTTCA ATCTQACAOG 
AGAGGTCCTG GAG7TGACCA AOAAGCTCSQG 
AACTTGTATT GTACTGGTOC TGTAGATC5T 
TrraCftACAA CTCC3«3ATGS GTATAC3CCA 
GAGGATGAAA ATGATTIACTA CC3CAATTTTT 
GAAAATTGCA GAGTGGGCAC TACTGTQGGA 
GAC1M39A.TGC ACACACGOCT GAACTACICC 
CXATITTCTA TGCATCCAAC TAC3U3aCGTG 
GAf^TAATTG ACAA0TAC3C3V GTTGAAAATA 
G6TCTACRGA CAACmCAAC TTGTATCaTT 
ACATTTACTC GTACTTCTTA TGTQACATCA 
TTACGAGTTA CIGTTGAGQA TAAGGACTTA 
AQCAXTTTAA. AO0QCAATGA AAATO BCAAT 
GAAGGAGTTC TTIGTGTAST VAAGCCTTZG 
CAAATTGGrTG ITUSTTAATGA AQCTCCATTT 
AGCAC2U3CAA CAGTTACTGT TAATGTAGAA 
CCAAl!AlCAGA CIGT7CSGAT GAAAGAAAAT 
AAAGCATATQ ACOC!AGAAAlC AAGAAOmGC 
COUU2AGGGT G66TCACX!AT TGATGAAAAT 
GATAGA£BU3G CAOAGACCAT CSUUUUVTGGC 
CAAG6AGGGA GAACATGTAC GGGGACACTG 
AGCXX»TTCA TACCTAAAAA GACAGTGATC 
AITGTTGGG6 TTGATCCTGA TQAGCdATC 
AGTTCTACTT CSUSAAdSTACA GAGAATGTGG 
CGTCTTTOCT ATCAQAATGA TCCTCCATTT 
GATAGACTTG GCATGTCTAG TGTCACTTCA 
GAAAATGACI OCACACATGG TGTAGATCX». 
AAGTGGGOCA TOCTTGCAAT ATT6TTGS8C 
CTGOTCTGTG GGGCrXCTGS GAOGTCTAAA 
OkSCAGAACC WkTTGirATC AAACACftGAA 
AAT06CTTCA C3\ACOCAAAC TSIGGGOGCT 



31 41 51

| | |

CAGGOGCTTC AGAOAAGCTA AGAAAAGCAC 60 

CGCTOCTCCr QAGCAGOSGG iXCAQACTGC 12 Q 

CCTCCTACCC GGGOCCGACG CTOGGCCOGC IQO 

GGGCGCOOCT CCOOCIOCTG GAAOGOAOCX: 240 

ATCTTAATAar ^TGCCMTIGA TGCCTGCAAA 300 

GATGOOGAGA AACrTTGTTGG TWaAOTTAAC 360 

ATTCATTCAA GTGAT(3CrQA CTTCCAAATT 420 

ACTATTCTAI TGTGCXGGGA GAAGAGAAGT 480 

CaUUSAAAAGA AGAAAATA3T lXaK?m-XTO 540 

CATACTAAAO AAAAAGTTCT AAOGGGGGCC 600 

ATGCI1AGA2A ACTCCTTGG6 TGCTTTT0(31 660 

GCCCAAAACT ATACCATATA CXATTOCaTA 720 

AATTTATTTT ATGTGGAfSAG AGACACTGGA 7^0 

GAGOUSTATG AATCTTTTGA GATAATTGCC &40 

GAACTTOCAC TG0CC3Cr£AAT AATCAAAAXA 900 

ACAjQAAGAAA CTTATACTTT TACaUlTTTTT 960 

CAAGTGTGTG CTAC^CGACAA AGATOBGCCr 1020 

atcatigggc aggtgccacc ascacccacc 10 bo 

ATCAOCACAA CAiTChTCTCA f3CTAG»C2U3A 1A40 

AAAGTACAAG ACATGGATGG TCA6TATTTT 1200 

AACATTOATG ATGTAAATGA CCACTTGOCA 12fiO 

CTGGAAGAAA ATACAGTT6A TGT6GAAATG 1320 

GTOAATACTG CTAACTGSAG AGCTAATTAT 13&0 

TXTAAAATTG 1!AA£!AGATGC CAAAACCAAT 1440 

AATTA3GA2U3 AAAAOCAACA GATGATCTTO 1500 

OrCCAGAGAGO CTAGTCCAAS ATCAtK^CAlTG 1560 

GATCAGGATG AGGGCGCXGA GTGrAACCCT 1^20 

OCAGAA6TGG GAAC3UU2AAG CAATGOATAT 1600 

AGTGGCAjnA fiSTATAAGAA A3KAACXGAT X740 

ACAGGATCAA TCAAAGTrTX CM3AAGCCTQ 1800 

ATATATAATA TTACRGTC3CX TGCATC3VGAC 1360 

GQGATTATAC TTCAAOROGT GAATGATAAC 1920 

ATCTGCAAAC CCAGCA1GTC ATCIGCSGGAG 1980 

GKXGSCCGAC CCTTTQAlCTT TAGTCIGGAG 2040 

AGACTOAAAG CAAXTAATGA TACAGCftGCA 210Q 

GGCTCATATG TAGTACCTAT AACAGTGAGA 2160 

TTOQATSTTA CACTGTGTQA CTGCAaCTACC 2220 

AGGATTGGC3G OTGOAGGAGT ACAACTTGQA 2280 

ATAOCATIGG TCTTTTGCAT CCI6TZ3A06 2340 

CAAGCAAAAO TAATXCClGA. TGAHTTAGCX 2400' 

GCT0CIGGA6 ATGACAAAOT GTATICIGOG 2460 

TCTGCICWOG GAGTTTGTG6 CACOGTGQGA 2S20 
TOVGGJ^TCA AAARCGGAGQ TCAGCSAGRCC
TCOOAATCCT GCCGOOeGGC TGGCCACCAT 
ACGGAGGIGG ACAACTGCAQ ATACACTTAC 
CTTGOTGRAA AAGTGTATCT QTGTAATCAA 
GTCCltSACAr ATJUVCTATGA AGGAAOAOGSA
GAACGACAAQ AAOAAGATGG GCTTGAA-TTT 
CTAGCi^GAAG C2^TGC3^TGAA GAGATGAGTG 
■TTTATGACIT TTAAAAAAAA TTACAAACCA 
TOaaGGTTTT TCTCTCATTA TTTGGATGGA 

AGAOVCTATA AACAAOTACA CAAATTTTTC
TCTATCCAAa OAGGTCTACA GAGAAATTAA 
A'XGACAACAG CCAATTTATA 6TGCAATAAA 
7ATTTGUU3C ACAACCTAAT GGAAAA-r-TCT 
AATTAAGTGT TCATGTGGT6 CTTGGAAACT 

ACTGCATTCT TGCTATTATT TPATTCTTGT
TTTCrAGCCR GGCfcTTQACT ATTACAATTT 



ATOGAAAIGG ^UGAAACSGAGQ ACACCAGAGC 2580

CACACCCTGO ACTOCTOCAG GGQAGOACAC 2640 

TGGGAGTGGC AC3U3ITTTAC TCAGOOCCGT 2700 

QATOAAAATC ACAAOOVTGC CCAAGACTAT 2760 

TGOGTOBCT6 GOTCIGXAGQ ^TGTaXSCAQT 2820 

TTaOATAATr TGG9W30CXAA ATTTAGGACA 2880 

T6TTCTAATA AQTCTCTGAA ASCSCAfiTGGC 2940 

AGTATTTTTT AAAGCACSAAQ ATGCTATTTG 3000 

ATCrrCTTTGG TCAAATGCAC ATTTACA0RS 3060 

AATTTTTACA TATTTTTAAA TEACITAICT 3120 

AGTCTGccrr ATTTorEACah. irrGGGTATA 3180

ATGTAATTAA TTCAAGTCCT TATTATTVGAC 3240 

AGASACCTTG CTTTARCATT ATCTCCAGTT 3300 

GTTGTTTTCC TGAACATCTA AAGTOTOTAG 3360 

AATGTQACCT TTTCACTGTS CAAAGGGAGA 3420 
CATT 






Seq ID NO: 416 Protein sequence
Protein Accession #: NP_077740.1



1 11 21 31 41 51 

I 1 ! I I 1 

MBAARFSGSH HGALCStUiIiL TLAILIFAfiD AC3CNVTLHVP SKLDABKLVS RVSTLKECFTA 60 

ASXilHSSDSD 3?QIIiBDG5VY TTNTIU*S3E KHSPTILI^SN TEKQEKKKIF VPJLiEHCTKVI* 120 

KKRHTKEKtfli RRAKRRWAPI PCSMUSNSLG PFPIiFUQQVQ SDTAQWYTIY YBIKGPGVDQ 180

EPRSHiFYVER DTGNI*yCrSP VDREQYESPE IIAFATTPDG YTPKLPIiPLI IKZEDEHEaiy 240 

PIPTBETYTF TIPEUCRVOT TVGQVCRTDK DBPDTOHTRL KVSIIGCV&P SPtltPSMHPT 300 

TGVITTT3SQ UlRBLIDKKQ WCIKVQfDMDG QYFGLQTTST CIIHXSDVND HLFTFTRTSY 360 

VTSVESmrVD VEIEiRVTVED KDLVNTAHWR ASYTILKGNE KONFKIVTDA XTKEGVIiCW 420 

KPIMYKBKaQ HILQIGWNB APFSHEZ^PR BAHSTATVTV IIVEDQDEGFB CHPPIQTVRM 480

KEKABVGTTS MGyKAXDPET I.TDPTGKVT3 VSSCXGGIKW RSIiDREABiZ 540 

KHSnOJITVL ASDQOSRTCT GTLGIILODV IflDNSPPIPIOC TVIICKPTMS aAElVAAffilED 600 

BPZBaPPFDF fiLESSTSEVQ ENWRIiKAXRD TAAHLSYQNP PPPBSYVVPI TVRDRWMSfi 660 

VTSLEWPLCD CITENDCTHR VDPRIGGOGV QLGKHAHAl LLGIAU-PCI LPTliVCGASG 720 

TSROPKVIPD DLAQQNIjrvS NTEAPGnDKV YSAHGFTTOT VGASAlQGVCS TVGSQimnSG 780

QETIBKVKGG HQTSESCSGA OEHHTUDSCR GGEITEVDNCR YTYSBNHSFT OESLGBSVYIi 840 

OfQDESnlKSA QDYVIaTYli^ GR6SVAGSVG GCBBRQBEDG EaBFUmZiEPK fRTXABAOAK 900 
It 

Seq ID NO: 417 DNA sequence
Nucleic Acid Accession #: NM_004943.1
Coding sequence: 202..2745

1 11 21 31 41 51

| | | | | |

CGCCAAAGGA AAAGCCXTCTT GGATGAGAGG Q^GGOGCTTC AGflfiAAQCTA M3AAAA6CAC 60

CTCTCCOOGC GCXX3CACCTC crCX53CCTOG CSSCTGCICCT GAGCAGOOGG OCCMSACTOC 120 

GCIOCGGCOG OGQdCJCroGC CCOGCGOAGC CCTCCTACCC CGGCCOGAOO CTCGGCOCXSC 180 

GACC3GCCX:G GAGCCCTCTC C3VIGGAGGCA GCCOGCCOCT CCGGCTCSCTG GA AOGGA GCC 240 

CTCTGOOGGC TOCTCCTGCT GAOCCTCfiCG ATCTTAATAZ TSGCCRtSTEBA TOCCIGCAAA 300

AA3QamCAT TACATGTTOC CTCCAAACTA QATGCGGAGA AACT1QITG6 TAGA0T1AAC 360 

CTGAAAGAQT G OriT ACAGC TGCAAATCEA ATTCATTCAA GTGATGCXGA CTTOCAAAITr 420 

TTGGAGGATG CTTCAQTCTA TACAAC3WUVT ACTATTCTRT TQTOCTOGGA GAA CTU^ ftgl? 480 

TTTACCATAT TACTTTCCAA CACTQAG7AC CAAGAAAAGA AGAAAATATT 1X3TCTTTTTG 540 

GAGCAICAAA CAAAGarCCT AAAGAAAAOA CAIACTAAA0 AAAAAGTTCT AAGGOGCGGC 600

AAGAGAAGATT GGGCTCX^AAT TCCTT6TTG6 ATGCSAGAAA ACXOCTTGGG TCCTTTTOCA 660 

crmoCTTC AACACSOTTCA ATCTGACAOG GCCCAAAACT ATACCATATA CTATTCOVXA 720 

AGW3GT0CTG GAGTTGACCA AQAACCTOSG AATTTATWT ATGIGGAGAG AQACACTGGA 780 
AACTTGTATT GTACTCGTGC TGTAGATOGT GAGCA^TATG AATCITTTGA GATAATTGCC 840 
TTTQCRACAA CTCCSftGAtOS GTATACXCCA GAACTXGCAC TOCCSCCTAAT ftATC AAAATA 900
GAGGATGAAA ATGATAACXA OOCAATTTXT AGAOAABAAA CXTATACTTT TACAA3TITT 960 

GU!\AAATTGCA GAGTGGGCAC TACTGTGQGA CAAGTGTGTO CXACIGACAA AGATGAGCXTT 1020 

GACACGAIGC ACACAOGCCT QAAGTACTCC ATCSVTTQGGC AGGTGCCAC3C ATCACCCAOC 1080

GaMTTTCTA TTGCATCCaiAC TACAaOCBTG ATCAOCACAA CATCATCTCA OCTAGACfiGA 1140 

GAOTTAATTG ACAAOTACCA GTTGAAAATA AAASTAC3UW3 ACATOQATGG TCAGTATTTT 1200

GGXCIACAGA CAACTTCAAC TTGTATC3UrT AACATTOaXG AX6TAAATGA OCACITGCCA 1260

ACATTTACTC GTACTTCTTA TGTGACATCA OTGGAAjSAAA ATACAGneA TQTGGAAATC 1320 

TTAC33AGTTA CTGTTGAGGA TAAGQACTTA GTCAATACTG CTAACTGGAS AGCTAATTAT 1380

ACCATTTTAA AGGGCAATQA AAATGGCAAT TTTAAAAJTTG TAACAGATQC CAAAAiC XaAT 1440

QAAGGAGSTC TTTGIGTAGT TAAGOCTTTG AAXTATGAA0 AAAAGCAACA GATGATC7T6 1500

CAAATTGGTO TAGTTAATGA ASCTCCATTS TCCnOfiGEAlGG GTAGTCCAAO ATCAGCCATG 1560 

AGCACAGCAA CAGTrEACIGT TAATGTAGAA GATCAGGATO AOGQCSOCTGA GTOTAACCCT 1620 

CCAATACAGA CTGTTCGCAT GAAAQAAAAX GCAGAAGTGG GAACAAC3UU3 CAATJOGATAT 1680

AAASCATATG ACCOVSAAAC AAGAAGTAGC AGTGGCATAR QQTATAAGRA ATTAACTQAT 1740 

OCAACAGGGT GG6TCACCAT TGATGAAAAT ACAGOATCAA TCAAftQTTTT CAGAAG0CI6 1800

GATAGAGAGG CAOAGAOCAT CAAAAATGGC AlATATAATA TTACAGTCCT TGCATCAGAC 1860

CAAGGA0GGA GAACATGTAC GGGGACACTG GGCSVTTATAC TTCAAGAOQT GAATGATAAC 1920 

AGCCCATTCA TAOCTAAAAA GACAGTGATC ATCTGCAAAC CCSVCCATGTC ATCTGCGGAG 1980 

ATTGTrOCGG TTGATCCTGA TGAGCCTATC CATOQCCCAC CCTTTGACTT TAGTCTGGAG 2040 

AGITCTACTT CAGAAGTACA QAGAATGTGO AGACTGAAAG CAAITAATGA TAC3GC&GCA 2100

CSGTCTTTCCT ATCAGAATGA TCGTCCSiITT GSCTCATflOT TauaTAGCTAT AACAGTOaGA 2160 

GAXAGACTTO GCATGICTAS TGTCACTTC3L ITGQATQTTA CACIGIGTGA CTGCATTACC 2220

GAAAATGACr GCACACATCS TBTAGATOCA AGGATTGGOG GTGQAGaaBr ACA ACT TGQA 2280 

Ai^GTGGGCCA TCCTT G CAAT ATTGTTGGGC A13U3CA3nOC TCTTTTGCAT CCTGXTTACQ 2340 


CTGGTCTGTG 

AATGOCTTCA 
TCAGGAATCA 
TOGGAAO'CCT 
ACQGAGGTGG 
CTTaOTQ&AG 
GTATCTGTGT 

AGA1GGGCTT 
CATGAAGAGA 
AAAAAATTAC 
TCATTATTTQ 
AOTAC&CAAA 
TCTACAQAGA 
TTTATJVSTGC 
CCTAATOGAA 
GTCSCSTGCCTG 
ATTATTTTAT 
TTGACTATTIA 



GQGCTTCTGG 
TAATTGTATC 
CRACCCAAAC 
AAAACGGAGO 
QCCSGGGGGC 
ACAACTGCAG 
AATCCATTAG 
AAICftAGATG 
AC»GGATC6G 
GAATTTTTGO 
TGAGTOTSTT 
AAACCAAGAA 
GATGGARTCr 
1TTTTCRATT 
AATTAAA&TC 
AATAAAATX3T 
AATTQTAGAG 
G7AACTGTTQ 
TCTTGTAATG 
CAATTTCATT 



GACGTCTJAA 
AAAC3U2AGAA 
TGTGGGGSCX 
TCAGGAOACC 
TGGCCACXaO' 
ATACACTTAC 
AGGACACAJCI 
AAAATCACAA 
TGGCTtSGGTC 
ATAATTTGGA 
CTAATAAGTC 
TTTTTTAAA6 
CTITGGTCftA 
TTTACRTATT 
TGCCTTATTT 
AATTAATTCA 
ACCTTGCTTT 
TTTTCCTGAA 
TGAOCTTTOC 



CTkACCAAAAG 
OCTCCTGGAG 
TCTGCTCAGG 
ATCGAAAIIGG 
CACACCCTGG 
TCGGAGTGGC 
CTGATTAAAA 
GCATQCCCAA 
TOTAGGTTGT 

gcccaaattt 
tctgaaagcc 
cagzvagatoc 
atggacattt 

TTTAAATTAC 
GTTRCATTTG 
AGTCCTTATT 
AACATTATCT 
CATCTAAAQT 
ACTGTGCAAA 



TRATTCCTGA 
ATGACAAAGT 
GAGTTTBTGB 
TGAAASGAGG 
ACTCCTGCAG 
ACAGTTTTAC 
ATTAAACftAT 
GACTATGTCC 
TGCAGTQAAC 
AGGACACTAG 
AGTGOCrrTTA 
TATTTGTGGG 
ACAGAOAeAC 
TTATCITCTA 
GGTATAKIGA 
ATAGACTATT 

ccw5ttaatt 
otgtagactg 

GGGAiQATTTC 



TGATTTAGCC 
OTATTCTGGG 
CACOGTGGGA 
ACACCAGACC 
GGGAGSACAC 
TCAGCCCCST 
GAAAGAAAGT 
TGftCATATAA 
GACA2U3AAQA 
CAGAAGCATG 

GGTTTTTCTC 
ACTATAAACA 
TCCAAQQAG6 
CAACA60CAA 
TGAAGCACAA 
AAGTGITCAT 
CATTCTTGCI 
TAGGCW30CA 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820
2880
2940
3000
3060
3120 
3180 
3240 
3300 

3420 
3480 



Seq ID NO: 418 Protein sequence
Protein Accession #: NP_004940.1



1 
I 

MBAARFSGSW 
AHIiIHBSDFD 
ICKBHTKBKVI) 
EPRNLFYVER 
PIFTEETYTF 
TGWITTTSSg 
VTSVEESirVD 

KEHAEVGTTS 

KHGnnrixvi. 



11 
1 

BGALCRLIiLb 
FQIXiEDQSVir 
HRABRRWAPZ 

rrraML^CTEP 

TIFENCKVGT 
U3RSL1DKYQ 
VEILRVTVED 

NGYKAYDPET 
ASDQGGETCT 
8LE8STSBVQ 



21 
I 

TIAJItlFASD 
TlNTZItLSSB 
PC5HI£BiSLG 
VDREQYE9FE 
TVGQVCATDK 
IjKIKVQDMDG 



VTSLDVTJjCD 
TSKQPKVIPD 
QETIEMVHGG 
OHTLXIQI 



DLAQQNLIVS 



APFSREASPR 
RSS&GIRYKK 
GTLGIIW2DV 
KMWRLKAIND 
VDnUGGGSV 



31 
I 

ACKNVTLKVP 
KRSFTlLliSH 
PFPIiFLQQVQ 
IIAPATTPDG 
DEPDTMHTRL 

oypaLOTTST 

AlOYTILKSHE 
SAMSTATVIV 
LTDPTGWVT3 
NENSPFIPKK 
TAAKItSYOND 
QDfflCKAXLAI 
YfiANGFTTQT 



41 
I 

SKLOnSKliVG 
TBHQFICKKIF 
SDTAOim*IY 
YTPEZiFIiPI*! 
KYSIIGQVPP 
CIIWID0VMD 
NGMFKIVTOA 
NVEDQDEGPE 
DBCTEGSXICVF 
TVIICKPTHS 
PPFGGYWPI 
LLGIAUiFCl 
VGASftQGVOG 



51 
I 

RVKUKBCPrCA 
VPIiEHQTKVli 

ysxhqfgvdq 

IKIEDENDNY 
SPnjFSMHPT 
HLPTfTRTSY 



CNPPIQTVRM 

&AEIVAVDPD 
TVRDRDGHSS 
LP*££VCGJ18G 



Q^KLGEESIR 



Seq ID NO: 419 DNA sequence
Nucleic Acid Accession #: NM_002722.
Coding sequence: 14..301

1 11 21 31 

1111 
ACTCIGSACr COGGATGGCT GCOGCACGCC 
GCGTGGCTCT GXTACTACna CX3VGTGCTGO GTOCCCAfiOG 
ACCCRGGGGA CRATGCCACA OCAQAaCAGA TGGCCCAQTA 
ACATCAACAT QCTOACCAGG GCTAGGTATG GGAAAAGACA 
TCTCGGAGTO GGGGTOOCOG CHTQCCGCT6 TGCCCAOOQA. 
AlUTGOCSkOCT TCtSXGTOCI AGGACTCCAT OAGGZUSGGCC 
ACCCTTGGCT CTGaOCSUULS CTTGCTOCCT GCTCCCACAC 
AAGCC 

Seq ID NO: 420 Protein sequence
Protein Accession #: NP_002713.1

1 11 21 31

| | | |

KAAABLCLSI* LIiLSTCVALI* LQPrJdGAQQA 
TRVBOBGKBHK EPTIMFSEOQ SPHAAVPKEL SPIJSIi 

Seq ID NO: 421 DNA sequence
Nucleic Acid Accession #: NM_032545.1
Coding sequence: 46..718



41 
I 

CCXdCTGCTC 
AQOCCXSiCTG 
TGCAGCTGAT 
CAAAGAGQAC 
QCIChGGOCG 



60 
120 
180 
240 
300 
360
420 
480 
540 
600 
660 
720 
780 
840



51 
I 

CT6TOCAOCT 
QAGCCAGTGT 
GTCOOTAQAT 
ACSGXITGGCCT 
CT8GACTTAT 



AOBCTCaWTA AAGCAAiSirCA 



41 51 
I I 



1 
\ 

AAACIGATCr 
GCATOTCAGG 
CTATCAAAGA 
GCACCCSACAG 
CQAGGGCT6G 
CGCGGGQCGG 
CCOGGCCCSUZ 
CCTGGAGCAC 
C5C1GCACTOC 
OCACGCTCAC 
ACrCCTGO^ 
CGTCCIOCA6 
TICTATGTTG 



11 

1 

TCaATGCACT 
CTTCXGTTTA 
GT^GAAACATA 
TCACCGCTCA 
GGGCCGGAGG 
OaCTGClGCA 
TTCAOC986CC 
GGAGCCTGGA 

GGGOCQAGCG 

OGGGAGCQGC 
ZAAATAATftG 



21 
I 

AAGAGAAGGA 
GGGXCAGTTT 
AOGGGCaaTAG 
ACTGGACCTC 
AGCOGCTCCC 
GGAA06G0GG 
GCTACTGCGA 
CCCTCCGCGC 
AQAOGCCTGA 
CCGGOCaoCGC 
QCCCQStOGC 
GCCCCTQCGK3 
ATGrTGTTTAG 



31 
1 

GACTCICAAA 
GGCATIACAG 
AGAGGAAGTC 
CAGIOlTTrC 
CTACTCCCGG 
TACCTGCGTG 
GCATGACCAG 

cTGcxaccrc 

GOSCTGTGAC 
GCOCaGCCTG 
GOOOQCGCAC 
AAOGOOGGGA 
TTXIkOOG3%A 



41 

I 

CCAAAAATGA 
ATCATCAATT 
ACCAAGGTIG 
GGAGAGGTQA 
GCTTTCGGAG 
CTGGGCAGCT 
AGGOGCAGTG 
TGCAGGTGCA 
CxrOAAAGACT 
CTACTCTTQC 
CCTCJOGTCCC 
CTTGGGCATC 
GCT6AASCAC 



60 
120 
180 
240 
300 
360 
420 



60 



51 
i 

CCTGGAGGCA 
TQGGAAACAG 
OCACTCSiGAA 
CTGGGAGCGC 
AGGGT0C3QTC 
TCTGCXSTGTG 
AATGCOacGC 
TCXTCGGGQC 
TCCTGGCSCTC 
TGCCCTQCGC 
TGGTCCCTTC 
GGCTTTAATT 
TGGGTGAATA 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
TTTTTATTGQ GTA2VTAAATA TTTTC»IGAZ^ AGGGCCftAAA ftftAAAAAAftA AAAAAAAAAA 840



Seq ID NO: 422 Protein sequence 
Protein Accession #: NP_115934.1

1 11 21 31 41

| | | | |

MTWRHaVSLt FTVSLALQII MLtSMSYORBK HHGGREEVIK VATQKHRQSP 
VTQSAEGWGP EEPLPYSTOVF GBGASftRPRC CHNGGTCVLQ SFCVOPMrr
SKCGALEHGA WTLRACHLCR CIFGAIiHCLP LQTPDRCDPK DPIiASHABGP 
Ll>PCAIiIiUR£r liRPDAPAHFS SLVPSVLQRE ItRPCGRPGLO HRl* 

Seq ID NO: 423 DNA sequence
Nucleic Acid Accession #: NM_006533.1
Coding sequence: 72..467

1 11 21 31 41 51 

| | | | | |

AGGGAQAQAG GGAQQQGAOa ARATTGGAGA COCCAGCACC CXICTTGCTCA CTCTCTTOCT 60

CACASTCCAC GATOGCCCGG TCCCTQGTt?! GCCTTGGTGX CATCATCTTG CTGTCTGCCT 120 

TCTCXXaoaCC TGGTGTCAOQ GGTGGTCCTA TGCOCAAGCrr taSCTGftOCGtS AAGCIGTGTG 180

CGGAOCAGGA OTGCAGOCAC CCTATCCCCA TGOCIorGGC CCTTCAGGAC IftCAT SQC CC 240

GOQACTGCae ATTCCTQAOe AXTCACOSQQ GCCaAGTGOT GTATGTCTTC TCCAAQCTOA 300 

AGQGCOC3TOa GOGGCTCTTC TOGOCaAGGCA GOSTTCAGGG AQATTACTAT GGAOATCTGG 360

CraCTCGCCT GGGCTATTTC GCCAGTaSCA TTOTCCaAGft GGACCAGACC CTGAAACCTQ 420 

QGAAAQTCBA ICTGAAOACA GAilZAAATGGa ATTTCTACTG CSCftGTGAfiCTF CftaCCTACCG 480 
CI6GCCCTGC CHTTTCCCCT CCTTGdSTTT ATOOUUlTAC AATCW3CCCA 6TGCAAAC 

Seq ID NO: 424 Protein sequence
Protein Accession #: NP_006524.1

1 11 21 31 41 51

| | | | | |

KARSLVCL6V IXLI^SAFSOP GVRGGEHPKIi ADRKLCWDQE CGHPISKAVA ZjQDVMA^Ca 60

n/TIHRGCfW YVFSKLKSRO ItXiTHGGSVQG DyXODIAARL GYPFSSlVRE DQTUCPaEaD 120 



51 
1 

GRYCEEIPQSR 120 
SAGQAPSUiL 180 



Seq ID NO: 425 DNA sequence
Nucleic Acid Accession #: NM_080870.1
Coding sequence: 3..710

1 11 21 31 41 51 

IgATGACACA AGTCACftGAA AAGTOCftCAG AACACXXMA AAROACCAOG TCRACCACRG 60

AOftAAACCAC AAGAACCCCA GAAAAGCCXA GGCTATACTC ASAGAAGRCC ATATGCAC5CA 120 

AAGGGAAAAA CACAOCAGTC CCAOAAAAGC CTACAGAAAA CCTGGGGAAC ACCACACIGA 180 

CXSUTTGAGAC CATAAAAGCC OCAGTAAAGir OCZVCAOAAAA OCCAGAAAAA ACASCAOCM 240 

TCACAAASaC TATAAJUM3CT TCAGTCRftGO TGWCRGCTGA CAftATCTCTC ACTACTROCT 300 

CTTCTCATCT AAATAAAACT GAAaTTACTC ATCAGGTGCC CACXGOTTCTr TTCACJOCTCA 360

TTACATCTAG AAOGAAGCTQ AGTTCTATCA CRTCAGAW3C CACAGGAAAC OfiGaSOCATC 420 

CATAOCTCAA TAAAGATGGC TCACAGAAAG CTATOChOGC HGGAC34lSA*r6 GC aaAGAAI G 480 

ATTCa-TTCCC TGCATOTOeC AtAGlTATTG IGOTCCrGOr GGCTGTQATT CTGCICCTQO 540 

TGTTCCTTGG OCTGATCTTC TTGOTCrCCT ATATGArTGCB OACAOSOCGC ACACIAAGOC 600 

55 AQAACAOOCA GTACAATGAT GCASAGGATO AGQGTGGCCX! CAATTCCTAC OCGGTCTACC 660 

TGATQGAGCA GCAGAATCTT QGCATGGGCC AQATCOCTTC COCAOGGTGA TCTTQGAGTA 720 

GGCOCaaflC OCTGGCTCTT CXaTOCTCM CXXXTTTTCSCT GGATOAGGAA COGSACTCAC 780 

AAITrCTAXT TCGGGGACm CAOQAAGGGC AjB&OAATChCT GACGGTTftCC AOTATTAAGG 840 

CTTCATCIGT TCTTOAaftCT GGTTQGOaRA rCGAGGTOATA AGCAaaOAGG eTGTAAGTTT 900 

AGGGGA(3RAA GAAGAAAGAA TGAArCAATAC GAGCftSACAT TCTCTGTAjGA AaOTAAlGGT 960

CTGAGAAT6A AAAfiGTGTTT eATOOACATG TTOTQGQeGC ACCAATGCAG AACACTGCAC 1020 

TOftQTCCTRA AGGAAQQACA GGftGOCTTAT AGGCftATGCJC CCAGACTGAC TTGTGaGTGQ 1080 

GGrTTATGGG QftAAGGC^GO QAJCXQAGGGC AGASTCTCIG GGTTTCAGGA CAGCATTAITG 1140 

VJU arri XXXI TCACTATTAC TTAAOAGTrT GT6TGTAAAC AGGCTCATCT CTGAGTTCTC 1200 

AGGACCCTTG CCOCCACCGC CAJTOTTTTA ATGAAAAAAA AAAACAAAAA AAACOGATCC 1260

AAGAAGAAAA GAQAATTTAT TTCCTTCfCC ACTCTCTCCA TQCOCTGGAQ AAAAAAAAOT 1320 

OCA0AAGAAA TCATAAATAT CrPCTCATCTA CATGGTTQCr TCCTCTTCCT OCCAAATCOC 1380 

TOMSTTTTOC TAAATCTCTA CAQaOGAOQC OCTOTTGGSrT TCGCTTGCEa GGTTGIGGar 1440 

GSACACXKSU^ GGAGGGOATT TTTATITGGC CRGCauSTCTC AOCO^CTGAV CTCCACCOCA 1500 

70 GACCTTCCCT GATT6GTGTC TCAC3<3CTTTA TTTTC3CTQTC TCTTOCACCA AARGGCAGCT 1560 

GTAGCTTTAT CTOSTAAAAQ TTACCOCTCT TCTCTACTGT OOCCATlXrCC TXTTCCTCCCR 1620 

CCTTCAOCCC AGATTCA2M3r TTTOCTCCTT GTAOGCATTT CAtCTGTOTG TGTTTTCTGG 1680 

ATTTTCTCTC T CJCi T Ul'T A TGGCXrATTTC ACC3TATTRC TGATTI3GSIA GAGGGGGAAA 1740 

AQOAGRATGA T<3ATGATAGT TTCCTTCTGT CTATTOACCT TTTTTATAAT AAftGTATAAC 1800 
ATOTT



Seq ID NO: 426 Protein sequence
Protein Accession #: NP_543146.1

1 11 21 31 41 51

| | | | | |

MTQfVTBKSTB HPBKTTSTTB KTTHTPBKPT LYSBKTICTK GKSTTFVFBKP TStSUSSPTCUT 60 

TETZKAPVK8 TBHPBKTAAV THTIKPSVIW TGDKSLTTTS SHUflKTEVTH QVPTGSPTtr 120 

TSRTICIiSSZT SEXSaSBBBP YU9KDGSQKG IHAGQMGBSni SPPAHAIVIV VLVAVZLIiLV 180 
-FUSCTFVVSY MMRTRRTLiTQ WJQWDRBDE GGPHSYPVYIi MBQQHLGKGQ IPSPR

Seq ID NO: 427 DNA sequence
Nucleic Acid Accession #: XM_069480.1
Coding sequence: 1..4383

1 11 21 31 41 51 

ATGGACACTQ TGCTGGTOCT GCTCCTGC3GC CTOCRGtSCCT TG3CCGGACC CAGTOCGAAQ 60 
COOCAGa7«5G ACTCTGTCrC AGACTGGGCX: ATTGTGTTOA TCACTCTCAC TTTQCyrOGCA 120
OCAATTGTCA GCCTAATGTA CSOtJTATCAAG AAGOCCrTGOC AGTTCCGGAG GQAGATGAGT 180
CTGOOGTSnS GCTQTGCSCrC TGTGRCCCCT TACAGCftGCC ACCATSAGOG QGAGGCTGCC 240
AGCCMGGCT ACTCT3?GrCA AATGAAAGCT TCTTOQGGGG CftOaTGCTAC TACATTCCAA 300 
GAATATCAGA AAACTOGQGA ACTCTCAACA TOCSSATCACA TATTPCCCM^T CACTCCRGGC 360 
CTTGTTTATA GTATCCCTTT TQATCACATT GTTCTGGATT CAGOACAAAG ACCTCCAGAQ 420 
CTCCCTAAAT CTACAOAAAT CCATGAGCAA AAACX3CCACT GCMlCMXAC AGGCCATTCT 480 
AAGCXaACTG ACAAfiCCTAC AGC5CAACTCC AAAACTATAG ACCACftAAAG CTCTACfiGAT S40 
AHTCATOASG CICCTOXAC TTCraaaGAA AACCOCROCA ACCRAGGOAA AGaMXX»ATG 600 
ATCCGGAAOC AGCGCTCTGT TGATCCTflCT Q&CTCCACTA OCACACaTAA AGAATCOGCT 660 
GGAAAAAAAC ATATAACtSCC AGCACCCARG AGCAAAATAA ACI GTCG TAA GTCCACAACA 720 
GGCAAATCAA aSGTAACAAG AAAATCAQAT AAAACTOGAA GACCTTTOGA AAAGTCCATG 780 
AGTACTTTGS ATAA{3ACAAG TACCAGCTCA CATAAGACTA CAACTTCCTT CC3iCAACrCA 840 
OaCAATTCRC AQACCAAGCA AAAAftiSC3U2A TCTTTTOCaG AAAAAATCAC AGCRfiCXJTCA 900 
AAARCAACAT ACAAGACJCAC AGGAACXXXA BAAGAGTCAS AAAAAAC3X3A AGATTCCftGA 960 
ACAACftGTTG dCTCAGACAA GCTCCTQACA AAAACTACay^ AAAACATACA AOAGACCATA 1020 
TCftBCCAATG AGCTCAC3^ ATCTCTAGCA QAGCCTACAG AACATGOWSG AAGOACRSCC 1080 
AATGAG»ACA ACACACCATC OCXAQCAGAG CCiaCS«SAAA ATRGAGAAAQ GACAGCCRAT 1140 
GAOAACauCCA CACTATCOCC 3«SCftGAGC3CT ACftSAAAATA GflSAAnSGM A<3CCAATGAG 1200 
AACACCOCAC CATTCC3CaflC AGCSOCCTRCA <3AAAATAGAQ AAATCACAOC CAATGAOAAT 1260 
ACKACACTAT TCCCflfiCAGA GCCTACAORA CATGGftCAAA GGACRGCCAA TGAGAACACC 1320 
ACAOCATCCC CAGCAOAjGGC TACRGAACAT GGAGAAAGC3A CRGOCAATGA GAACACTACA 13 BO 
CCATC3CCCAG CJ^GOCTAC AfiAACATQGA GAAAOOACCC CATTTGGCRA TBACAAAACC 1440 
ACATCATCCT CAGCaGAGTC TACAQAACAT BOftGAAAGGA CDCCACTGOC CMCGW5AAC 1500 
ACCA<aCCAT CCCCAGCAGA GCCIACAGAA AATM»GAAA GGACttGGCAA TQAQAACAOC 1560 
ACACCATCCC CAGCAGGGCC TACAOAAAAC AC3AGAAACC5A CftGCCAACflA GAAGACX»CA 1620 
CTATCCOCftG TAGASCCTAC AGAAAATAC3A GAAACAACAG CCAATGAGRA OaoCACACCA 16B0 
TGCCCAGC3US AGOCTACRSA AAATQGACAA AOOACCOCAT TXGOCAATQA GAAAACCACA 1740 
TCATCCTCftG CRQAQOCTAC AGAACROTSA GaWAOSftCSOC CACTGGCCAA TGAGAACACC IBOO 
ACftOCA'JCCX: CRGCAOAQCC TACftfiAAAAT AfaM3AA»QGn CRGCC AATGA GAAGACCACA 1860 
CCATCCCCAG CAGAGCCTAC AGAAAATC343A GAGROQACTC CTTrGGCCJiA TOftOaAGACC 1920 
AOGCCATCTC TAGCAGAGOC TACAGAAAAT OGACAAAGGA DCCCATTTGC CAATGAOAAG 1980 
AC3CACATCAT CCICRGCAGA GOCTACftlSAA CaiOGAAGAAA GGACTCC^ACT GGCCAATGAG 2040 
AACAOCaCAC CATOCOOSGC A(GaWC3CXACA OAAAATAOAO AAAGGACAGC CAATCftOAAC 2100 
ACCaCACCAT OCXXSWSCAGG GCCTACWGAA AATftGaSAAA TGACASCCAA OGAQAAGACC 21€0 
ACACTATTCC CAGOVGAGCC TACAGAAAAT AGAQAAAGGA CAjQCCAATGA GRBfiACCACA 2220 
TCATCCCCW3 CAjOAOCCTAC AOAAAATGGA CAARGGAOCC CATTTOCCAA TGiWIAAAACC 2280 
ACATCATCOC CAGCRGABCC T&CAGAACAC GOAQAAAQGA CCCCACTGGC CAATGAGAAC 2340 
ACCACACTAT CCC3CnGCRGR QOCTACASAA AATAGAQAAA GGACftOCCAA TGAQAAGAOC 2400 
ACACCATTCC CMCAflaGCC TACBOAAAAT AQAlGBARGGA CASCCRATOA GAACACCACR 2460 
OCATCCCCAG CACAGCCTAC AfiAAAATGQA GACAGGACTC CATTGGCCAA TOAQRAGACC 2S20 
ACACCATCTC TAGCAGACCC TACAOAAAAT OOAAAAAGGA CCOCATTTQC CaRTGAGAAG 2580 
ACCACATCA7 CCTCAGCAGA GCCIACftQAA CAOGCASAAA GQACTCCACT GGCCAATGAQ 2€40 
AACACCaCAT CATCCCCAGC AQZU5CCIACA GAAAATAGAG AAfiGGACT^ CAATGAflAAG 2700 
ACCAOiCftAT TCSOCaGCAOA GOCIWMAA AATMaOAAA GCACAGOCAA TOAGAAGAOC 2760 
ACACCATTCC CAQCAGAGCC TACAGAAAAT AfifiGAATGGA CRGCCAATGA GAACACCRC& 2820 
CTATCCCCAIQ CAGAGCCTAC AGAACATGftA GAAATGACCC CATTGGCCAA TGAGAAGACC 2880 
ACACX»aa2C CA0CA!3AGCC TAOVG&AAAT QGnSRAACOA CCCCATTTAC CAATGAI3AAG 2940 
ACCACACCAT CCTCftGCAlBA GCCCACAfiAA CATGGAfiAAA GGRCOCCACT GGCCRATGAG 3000 
ATCACCACAC CATCCOGAGC AOaSCCTACA GAACATGOAS AAAGCJATAfiC CAATOftGAAG 3060 
GGCACAOCAT CCOCRGCAAA GOCTACAGAA CATQGAGAAA CGftCAGTCAA XGRGGRCACC 3120 
ACACCATCCT CAGCAGAGCC TACAGAAAAT GGAQAAAGGA CCCCACTGGC CAATGASAAC 3180 
AOCACAACAT CCOCAACAGA OTCTACAGRA CATGGAJMA GSACAGCCAA TGAGAAQACC 3240 
ACAOCATCCC CReCRfiftOCC XAOUaAAC&T GGAQAAAGGA CACCATCAGC CRATGAGAAG 3300 
ACCATACCAT CTCCAGCAAA GCCTACKSAA CROGAAGAAA TGACCCCATC QGCCAATGAG 33fiD 
AACACCACAC CATCCCCACW AAAJGCCTACA GAACATGGAG AAAAGACTAC ATTGGOCAAT 3420 
GAOAAGATCA CACTATCOOC AGAAGGGOCT ACAGAACftTG GAGCAAAAAC TACSTCOGCC 3480 
AATGAGAAGA TCRCAOCATC OCTAQOUUC CCEACAGAAC ATGGAGAAAG GAOCAGATOV 3540 
CCCAAIGAOV AGATC3(CCTC ATCTGCASCA GZUSTCIACAG AACATAGAGA TAGGGCTACA 3600 
TCROCCAATG TGATCACAOC a0OO«3CaGCA GAflOCTATAA AACATGCAAA AAGGACCACA 3660 
TTGGCCCATG AGAAGATGAC ACAAGTCACA GAAAAGTCCA GAGAACAOCC AGAAAAOACC 3720 
ACGTCAACCA CAGAGAAAAC CACAAOAACC CCA)3RAAAGC CTACGGEATA CTCAGA6AAG 3780 
AOCATATQCA CCAAAGGGAA AAACACAOCA GTCCCAGAAA AGCCTACAGA AAACCEeOQG 3840 
AACACCACAC TGACCftCTOA GACCATAAAA GCCOCAfiTAA AGTCCACRGA AAACOCAGAA 3900 
2AAACAGCAC CftGTOlCAAA QACTATAAAA OCTTCAGTCa. AGGTCACAQG AGACAAATGT 3960 
CTCACTACTA CCTCTTCTCA TCTAAAIAAA ACTQAAiSTTA CTCATOWSGT GCCCACTQGT 4020 
TCTTTCAGCC TCATTACATC TAGAAOGARG CTGAGTTCTA TCACATCAGA AG0CACM36A 4080 
AACGAGAGCC ATCCATACCT CAATAAAQAT GGCTCACAGA AAGQTATCCA CQCCQGACAQ 4140 
ATGGGAGA6A ATOATTCATT CCCrSCATOQ GCCATAOTTA TTGTGGTCCT GGT<3QCTGTG 4200 
ATTCrCCrCC TGGTGTTOCT TGGCCIGATC TTCTTGGTCT CCTATATGAT GCGGACACQC 4260 
CGCAGACM CCCAGAACAC CCftGTOCAAT 6ATGCAGA6G ATCAOaSTGG GCCCRATTCC 4320 
TACXMGGTCT AOCTSATOGA GCAI3CAGA&T CETGGCATQO GCCAGATCCC TTCCOCAOOG 4380 
TGA 
Seq ID NO: 428 Protein sequence
Protein Accession #: xp_0694BO-l

1 11 21 31 41 51 

| | | | | |

MDTVLVLliXJ LQALAGPSPK PQKDSVSDWA IVIilTLTLVA AXtf&LKyelK XAOQFRHEMS 6D 

LC5CG0GSVTP YSSHHEGEAA SQRYSCQMICA SWGftaATTPQ EyQKTGSLST 50HIPPX.TPO lao 

LVYSIFFDHl VLHSGQRPPS IjPKSTEZHEQ KRHCNTaSHS KPTDKPTGWS JCTIDHKSgTD IGO 

NHEAPFXSBE NSSBTQQKDPM IKHQRSVDPA DSTTTHKESA GKKHITPAPK SKIHCKKSTT 240 

10 GKSTVTRKSD KTGRPLBKSM STLDKTSTSS HKTTTSPHMS GNSQTKQKST SgPEKITA AS 300 

KTTYKTTGTP EESEKTEDSa TTVASDKLI/r KTTKNIQBri SAHELTQSIA EPJ^RDGRTA 3fiO 

HEiailTPSPAE PTEBIRERTAK EBITTLSPABP TEHRERT2W NTAPEPAGPT EKHEMTAHEH 420 

TTItPPABPTB HGERTANENT TPSPAEPTBH GERTANENTT PSPAEPTBHG ERTPFANDKT 480 

TfiSfiAESTEH GEftTPIpRJBBM TTPfiPABPTE NRERTANKNT TPSPAGPTEN RBTTAlTEKTT 54 0 

15 I>3PVEPTEHR ETTAKEKTIP SPAEPTESIGQ RTPFAMEKTT SSSAEPTEHG ERTPXANEHT 600 

TPSPAEPTHU HKRTAMEKTT PSPAEPTQI6 DRTPIAHEKT TPSEiftfiPTEH gQRTP EAMKC 660 

TTSSSIAEPTB HEERTPIANE NTTPSPAEPT EHRBHTRHSI TTPSP flgPTE XmSMTAHEBS* 7Z0 

TLFPASFTEK RERTANEKTr S8PAEPTEHG QHTPPANEKT TSSPAEPTEH- GERTPLANEK 7 BO 

TT&SPA£P^ NRERTAMSKT TPFPAEPTEBI REHTANENTT P&PAQPTEIinS DRTPWUIElCr 840 

20 TPSLAEPTBN GKRTPPANEK TtfiSSAEPTE HAESTPXiANB HTTSSPAEPT EMHBia 'ANSK 9O0 

TTQPPAEPTE NRESTANEKI TPPPABPa?EJSI RBWTBlJEHTr IiSPAEPTEHE EHTPLKNBKT 960 

TI*SPABPT2H OBRrPPTNEK TTPSBABPTB KGERTFIiABlB ITTPSRAEPT ESBERtidSEK 1020 

ATPSPAKPTE HGETTVHEDT TPSSREPTEN GBRTPLAHEM TITSPTESTE EGSRTMVEKT 1080 

TPSPABPTEH GERTPSflUBK TIPSPflKPTE HBEHTPfiANE WTTPSPVKPT BHQBKTTLAN 1140 

25 BKITIiSPBGP TEHOAKTTSA HBiaTPSLMC Pr EHGE RTTS PKDKITaSAA BSTEHRDKAT 1200 

BANVITPAPA EPIKHAJtRTT LAHEKMTQVT EBC93SHPEECI TSTTBIdTRT PBgP ILYSB C 12 60 

TICTKSKMTP VPBKPTBNWJ HTMpTTETIK APVKSTENFS KTIUWTICTXK PSVKVtGDKS 1320 

tiTTTSSHMK TSVTBQftfPTG SFTIiITfiaTK JjSSlTSBRTO HBSHPTOIED G8QK6ZS1AGQ 1380 

MOBADSFPAM AlVIWIiVAV lUniVFLQLl Fl»VSYHMRTIl UTItTQKTQm PABDBG(^S 144 D 

30 YFVYLHBQQK LGHOQlPEPIt 



Seq ID NO: 429 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 1..10674

1 11 21 31 41 51

ATCTOGCCTC GCCTOaCCTT TTGTTGCTGG GQTCTGGCGC TGGTTTCGOO CTGGGOGACC 60 
TTTCftGCAGA T6TOC30CQTC GCGCPATTTC AGCTTCX3t3CC TCTTOCCCGA GACCGOGCCC 120 
40 GGGGOCCCSQG GGA0TATCCC COOiCCQCCC GCTC3CrGG0a AOSAaGCJeGC GGGGR GCAflA 180 
QTGGAQOGGC TGGGOCftOGC GTTCOGGC3QA GQOGTQCGGC TGCTGCGGGA GC1C3USOGAG 240 
OGCCTQGfiGC TTGTCTTOCT GOTQGATGAT TOQTCCAGOG TGGGOGMVGT CAACTTCCGC 300 
ASCGAGCTCA TOTTCGTCOG CAAGCTGCTG TCOGACTTCC CCGTGGTGOC CACOQCCaCG 360 
OGOGTQGCCA TCGTGAtJCXT CTCX3TCCAAG AACTACGTGG TOCJOGOGCTT CGATTACftTC 420 
45 TC3CRCXXS3CC GCOOGOGCCA GCACAAGICC OGGCTSCTCC TCCAASAGAT CCCTOCCATG 480 
TCGTAOaOAie GTQG0C3GCAC GTACAGCftAG G6C3SCCITOC AGCSkAGOOSC GCaAATTCTT 540 
CTTCaXtGCTA GAGAAMCTC AACAAAA6TT CXKmCTCA TGRCrGATGG ATATTCCAAT 600 
GGOGGAOAlGC CTAGIVCCAAT TGCAOCGXCA CTGOOtaGATT CftSCSAlGTGGA GATCTTCACT 660 
TTUQQCATAT GGC2AAGGGAA CAITOGAGAG CXGAATGAEA TGGCTTCCftC CCC3«AGGAG 720 
50 GftSCftCTGTT AOCTGCTACA CAGTTTTGAA GAATTT6JW3G CTTTAGCTOG COGGGCATTG 780 
CATGAAGATC TACCTTCTGG QnGTTTTATT CAAGRTfiAXA TGGTCCACTG CICATATCTT 840 
TOTGATGAAG GCAAQGACTG CTGTGAOCGA ATGCSGMySCT GCftAATGTOQ GAiChCACACA 900 
GGOaSTTTTG AGTGCATCTQ TGAAAAGQGCS TATTftCQQGA AAGOTCTGCA OTATGAA3GC 960 
^ ACAGCTTGOC CATCGGQGAC ATACMAACCT GAAGGCTCftC CfiGGAGGRAT CftGCftGTTGC 1020 
55 ATTCCATGTC CTGATGAAAA TCaCAOCrCT OCWXTQGAA QCACATOCOC TGAflGACTOT 1080 
GTCTGCAGAG AQGQATACAG OGCATCTGSC CAGAOCTOnS AACTTGICCA GTOGCCIGOC 1140 
CTGRAGCCTC COGAAAATOG TTACTTTATC CAAAACACTT GOUUCAAOCA CTT ^M GCA 1200 
aCCTGTGGGO TCCGATGTCA DCSCrrGGATTT GAICTTGTGG GRAGCAGCAT CATCTTATGT 1260 
CTACCCAATG GTTTGTOOTC CGGXTdAfiAG WSCTACTGCA GRj^rAASAAC ATGTCCTCAT 1320 
60 CTCCGCCAGC CGRAACR.TGQ CCACATCTGC TGTTCTAC3A GGGAAAT6TT ATATA flGRCA 1380 
ACA^ICTrrorOG TTGCCTOIGA TCSAAGGGTAC AQZbCtAGAAQ GCftGTGATAA GCTTACTTOT 1440 
CAftGOAAACA GOCAGTGGGA TGGGCCAGAA COGOGGTGTG TGGAQCGCCA CTSTIICCACC ISOO 
TTTCAGATOC CCAAAGATGT CATCATATOC CCOCACAACT GTGGCAAOCA GCCiU3GC:AAA 1560 
TTTGC5GACGA TCTGCTATGI AAGTTGOGGC CAAGGGXTCA TTTTATCTGG AGTCRAAGAA 1620 
65 ATGdXaGAT OTACCACTTC TfiGAAAATGG AATGTCGOaG TTCAGGCftGC TGTGTOXAAA 1680 
GAOTTGORGG CTCCTCSMT CAACTOTCCr AAGOACATAO AOGCXAAGAC TCZGQAACItG 1740 
OAGATTCTra C3CRATSTTAC CIGGCAGiATT CCAACAGCm AAGACAACTC TOGTGAAAAG 1800 
GTOTCTGTOC AOSTTCATCC AOCTTTCRCC CCACCITEACC TTTTOOCAAT TOGIMSATOTT 1860 
^ GCTATOGTAT ACROGGCAAC TORCCrTArOC GGCAACCAGG CX3«3CTOCAT MTCCATATC 1920 
70 AflGGTTArXQ ATGCAGAACC ACCTGTCATA GACTGOTGCA GATCTCSCTOC TCCOGTCCW? 19B0 
GTCTCBGAGA AGGTACAXGC OOCAAaCIGS GnXGAGOCTC AfiXTCTCAaA CMiCCCKSGG 2040 
GCTGAATTQG TCATTACCM AAGTCATACA CAAGGAGACC TTTTCaaCTCA AGGSGAGAGT 2100 
ATAOTACAGT ATACftGCCAC TORCX:CCICA GQCftATAACA QGACATOTGA lATCCATATT 2160 
GTCATAAAAG GTTCTCCCTG TGAAATTCXa^ TTCACACCIG TftAATGQGGA TTTTATATGC 2220 
75 ACTCCAGATA ATACTGGAGT CftACTGTACA TTAACXTGCT TGGAGGGCEA TGA1TTCACA 2280 
GARGGCTCEA CTGAiC^lAGTA mATTGTGCr TATGAAGAIG GOOTCXGGAA ACC3U\CATAT 2340 
ACCACZGAAT GGCCAGACTa TGCCAAAAAA OGTrrTGOUl AOCACGGGTT C3\aGTW xx.i 2400 
CSRGAlGTTCr ACaAAOCAGC TOGTTGfTQAT GACACAGATC TOATGARGAA OTTTTCTGAA 2460 
GCATITGAGA COACCCTGGG AAAAATGGTC CCATCATTTT GTAGTGAIGC AGAGGACATT 2520 
80 GRCTGCaOAC TGGAGQAGAA GCTGACCAAA AAATATTGCC TAGAATATAA TTATGACTAT 25B0 
GAAAATGGCT TTGCAATTGG AOC3U3GXGGC TGGBGTGCAG CTAATAGGCT GGBVTTACICT 2640 
TACBATGACI TCCTOQACAC TGTGCAAGAA ACAQOCACAA GCATOGGCAA □JGOCAAGTCC 2700 
TCACGGATTA AAAGAAGTtSC CCXATTATCT GIVCIATAAAA TTAAGTlAAT TTTCAACKrC 2760 
AjC3U3CrAGTG TGOCaTTAOC CGATGAAAGA AATGATACCC TTGAATGGGA AAATCSkGCAA 2620 
CGACTCCTTC AGACATTGOA AACTATCACA AATAAACIG^ AAAGGACTCT CAACRAROAC 2880
CCCATCTATT CCTTTCRGCT TGCATCAGRA ATACTTATfttS CCGAGAGCAA. TTCATTAGAA 2940 
AC3kAAAAAG6 CTTCCCXKTT CTGCAGRCCA GGCTCAGTGC TOAOnOCSGCG TATGTGTGTC 3000 
AATTCCOCTT mSGGAACCTA TTATAATCTG GAACATTTCA CCTGTGAAAO CTGCCQGRTC 3060 
GCSftTCCTATC AAGATtaAAGA AGGGCAACTT GRiSTGCAAGC TTTQCCCCTrC TGGGATGTAC 3120 
ACQGRATATA TCCATTCAAG AAACaTCTCT GATTGTAAAG CtC&STGTAA ACMGGCACC 3180 
TACICZa'ACA GTGGACTTGA aaCTTGTOAA TCGTGTCCRC TGGGCACTIA TCW30C3U\AA 3240 
TTTG6TTCX:C OGAfiCTGCCT CTOerOTCCA OAAAACAGCT CAACTGTGAA AAGflCGAGOC 3 3 DO 
6TGAACATTT CT6CATGTGQ AQTTCCTTGT CCAISAAGOAA AATTCTCQCX3 TrCTGGGTTA 3360 
ATGCCCTGTC ACCCATGTCC TGCTGACTAT TAOCAACCTA AT£3CAjQGGAA GGCCTTCTSC 3420 
CTGGCCTOTC CCTTTTATC3C3 AACTACCCCA TTCGCTGQTT OCAGATOCAT CACAGAATQT 3480 
TCAACTTCftG TTCTQAATAT TACTATTTTC GGTGOATTTG GOCATCIQGA OTTOTTAAAT 3540 
TGTCCTTCTQ AGGTTTTOCA TQAATGCTTC TTTAACCCTT GCCACAATAG TGGAACCTOC 3600 
CAOCAACITG G0OGTG(JrTA TGTTTQTCTC TGTCCACaTO GATATACftGG CTTAAAGIGT 3660 
GAAACAGACA TOGATGAGTO CAGCOCACTQ OCTTOCCTCA ACAATGGW3T TIGTAAABAC 3720 
CTAaTTQ«3G AATTCATTTG TGAGTCKJCCA TCAGGTTACA CAOGTCRSCG GTOTOAAGAA 3780 
AATATAAATG AGTGTAGCTC C3WBTCCTTGT TTAAATAARO GAATCTGTGT TGATGGTGT3 3840 
GC:TGGCIATC GrrOCaiCAIG TGT0AA»aGA TTTOTAGGGC TGCATSGTGA AACAGAASTC 3900 
AATGARTOCC AfiTCAAACCC ATGCTTAAAT AAlGCaGTCT GXGRftGACCA GGTTGGGGQA 3960 
TTCTTGTGCA AATGCCCACC TQGATTTTTG QGTACCCXSftT GTGGftAftOAA CGTCGATGA6 4020 
TGTCTCAGTC AGCCATGCAA AAATGGAGCT ACCTGTAAftG AOGGTGOCAA TAGCTXCfiGA 4080 
TGCCTQIGTG CAGCTQGCTT CACftGQATCA GRCTOTGAAT TGAACATCAA TGAATOTCM 4140 
TCIAMTOCAT GTAGAAATCA GSQCAXXTGO: GTGGATGAAT TAAATTCATA CRSTTGTAAA 4200 
TGITCASGCftG QATTTTCAGG CARAafiGTCT GAAAdftGAAC AGTCTAC^GG CTTTAACCTG 4260 
GATTTTQAAG TTTCTGGCAT CTATGGATAT GTCATGCTA0 ATOGGftTOCT CGCATCTCIC 4320 
CATGCTCTAA CCTGTACCTT CTacSATGAAA TCCTOTGACG ACATGAACTA TOQAiiVCACCA 4380 
ATCTOCTAIG CAGTTOATAA CGQCAQCaAC AATACCTlQC TCCTQACTGA TTATAaOOGC 4440 
TXSQGTTCTTT ATOTGAATOO CftaOGAftAAa ATAACAAACF QTCCCXCGGT GaAnGATGGC 4S00 
AQAIGGCATC ATATTQCRAT CRCTrGGACA AGTGOCAATG GCATCTOGAA AgTCTATATC 4560 
GATGGGAAAT TATCTCACGQ TGGTGCTOGC CrCTCTGTTG GTTTGCCCAT ACCTGGTCOT 4620 
GCnGOGTTAG TTCTC5GGGCA AGAOCAAGAC AAAAAAGGAG A«3GQAT1CAG CCCAGCTGAG 4680 
rCTTTTOTGQ GCTCCATAAO CCftGCTCAAC CXCTGG0ACT ATGTCCTGTC TCCACAGCRQ 4740 
GTOAAOTCAC TGGCTWMC CTGCCCAdGAG GAACTCaSTA AAOGAAAC3GX GTTAGC3WGG 4800 
CClGATrrCT TtGXCAGGAAT TGIGGGGAAA QTGAAGATCG ATTCTAAGAO CMrATTTTGT 4860 
TCTGATTGCC CAOSCTTAGG AOGGTCAGTG OCTCATCTGA GAACTGCATC TGAAGATTTA 4920 
AA6CCZ«3GTT OCAARGTCRA TCrGTTCTGT QATCCAOGCT TCCftGCTGOT GGGGAAOCCT 4980 
OTGISRGTACr GTrCTGAATCA AGGACAGTGG ACftCAACCAC TTOCTCRCTG TGAAOGCSWCT 5040 
AGGIGTOGGG TGCCACCTCC TT^OSAGAAT GGCTTCCATT CAGCOGATGA CTTCTATGCT SlOO 
GGCAiGCACAS TAACCTAOCA GTGOIRCAAT GGCTACCATG TATTQGGIGA CTCAAQQATG 5160 
TTCTGTACaG AlTAAaHSGGAS ClGGAACBGC QTTTCACX»T CCrGGCTTGA TGJQG ATGA O 5220 
TQTaCftGTTG GATCAGATTTG TAGT0AGCAT GCTTCTTGCC TGAAOOEAGA TGSATCCXAC 5280 
AXATCTTCAT GTGTGCCACC OTACACAGGA SATGGGAAAA ACIXSTGCAOA AOCTATAAAA 5340 
TOTAAOGCTC CAGQAAATCC GGAAAA3X3GC GACTCCTCAG GTGW3ATTTA TRCAaTAGGT 54 00 
GCOSGMTTCA CftTTTTOGTG TCMGAAGOA TACCAGTTGA TGGGACTAAC CAAAATCACA 5460 
TGTTTGQAGT CIGGAGAATO GAATCATCTTA ATAOCATATT GTAAROC TOT XTGATOTGOT 5520 
AAAOCGGCTA TTCXiAfiAAAA TGQTTGCATT GAGGAGTTAG CMTTACTTT TOqCftBG AAA 55B0 
OTGACATATA OGTGTAATAA AGGATATACT CTGGCC0GTG ATAAA9AATC ATOCTOTCTT S640 
GCTAACAGrr CTTGGA6TCA TTOCCCTOCT GMTGTGAAC C3W3TGAAGTG TTCTAGTOC3G 5700 
OAAAATATAA ATAATOQRAA AJTATATTTTG AGTGGGCTTA CCTACCXTTC TACTGCATCA 5760 
TATTCATOGa ATACAGGATA CRflCTTACM QQCCCITOCA TTA-ETGAATG CftflGGCTTCT 5820 
GQCATCTQGG ACAGAGCGCC AOCTOCCTGT CROCTOGTCT TCTOTGSAGA ACCAOCXCXX: 5880 
ATCAAAGATO CXGTCATTAC GGGGAATAAC TTCACTTTCA eGAftCACOCJT ChCTShCMJ^ S940 
TGCAAASAA6 GCTATACTCT TGCTGQTCTT GACACCATTG AATQCJCTCGGC CBAOOGCiAAG 6000 
TG(»i(3TIuaAA GTGACC3U3CA erGCCTOGCT GTCTCCTOTG A!PGAG CCAO C CATTGTGOAC 6060 
CACGCCTCTC CAOAGACTGC CCA!rOQGCTC TTTQGASACA TTGCaOTCTA Cta iJXtakJgCT 6120 
OATOGTTACA GCCrPBCPJSA CAATTCCCaifi CTTCTCTOCSi. ATGCOCAGGG <3W^|G9aSA 61B0 
OCCCCRGAAG OTCAAGACAT GC0CC3STTGT ATAGCTCATT TCTOTGAAAA ACCTCCA3GB 6240 
GTTTCCT&VA GGATCTTQCaA AICTGTQAQC AAAGCAAAAT *3CTGCAGCTQG CICAGTTGTa 6300 
AGCITTAAAT OCaVTCGAAGG CSTTGTACTO AACA£XrCCfiG C3kAAfiATTGA AIGTATGAGA 6360 
GGTGQBCaGT GGAAOCJCTXC OCCCATGTCC ATOCAGTGCA TOCC3X3T60G GTGTOOACaflG 6420 
CCACCAAGCA TCATGAATOG CTAK3CAAGT GOATChAACT AChGTTTTGG AGGCATOSTQ 6480 
(3CTTACA6CT QQMkCRMSGG GTTCTACATC AAAQ(^G2M AQAAGAfSCAC CTGOanAGCC 6540 
ACAGGGCAGT OGAGTM3TCC TATACCGACO TGCXaCCCGG JMCTTOTGO TGAACCACCT 6600 
AAG6ITGRGA ATOGCTTTCT GGAOCATACA ACTGGCAGGA TCTTlJGAfiAO 713AAfirGAGG 6660 
TATCAOTGIA ACCOGGQCTA TAAGTCA3TC GGAAOTCCTG TATT TOTCIG CCAAQOCAftJ 6720 
CX5CCACTGGC ACAGTGAATC C3CCTCTGATG TOTGXTOCXC TOGACTOraO AAAACCTCOC 6760 
COGATCXaGA ATOGCrrPCAl GAAAGGAQAA AACTTTGftAiS TAGGGTCCAA GGTTCAQTTV 6840 
TTCTGIAATG AGGGITATGA GCTTGTTGGT GACaSTTCTT OGACATGTCA GRAATCTGGC 6900 
AAATEGGAATA AGAAGTCAAA TOCaUVAGTGC: ATGCCTOCCA AGTGCOCAGA GOCX3CCCCTC 6960 
TTGGAAAACC AOCXAOTATT AAAGC3AfiTTG ACCACOGRGG XAlGGftiQmXST GACATTTTCC 7020 
TCTAAW3AAG GGCATGTCCT QCAAGGCCCC TCTGTOCTOA AATGCTTOOC AXCGCAGCAA. 7080 
TGGftATGACr CTTTCCCrGT TTGTAAGATT GTTCITTGTA CC<X»jCCTCC CCTAATTTOC 7140 
TTTQQIGICC CCA1TOCTTC TTCIGCTCTT CATTTTGOAA CTACTOTCAA GrATTCTTGT 7200 
GXAGGTOGGT TTTTCCTA2W3 AGGAAATTCT ACC3«XX:TCT GCCAACCtGA TGGCACCTGG 7260 
AQCTCXCCAC TGCCAGAATQ TQTTCCAGTA <3AATGTCCC3C AAOCTGAGGA AA1CCCCAAX 7320 
GGAATCATTG ATOTGCAAGG GCTTOCCTAT CTCAGCACZUS CTCTCSnTAC CTGC3WSGCA 7380 
GOCrrTGAAT TGGTOGGAAA TACTACCAOC CETTGTGGAa AA21ATGGTCA ClOGCITGGA 7440 
GGAAAACCAA CATQTAAAGC CATTGAGlGC CTGAAACOCA AGC3BftQATTTT GAMX3GCAAA 7500 
TTCICTTACA OGGAOCTACA CTATOGACAG AOOGTTRCCT ACTClTGCaA CCGaOGCTTT 7560 
COGCTCXavAG GTCCCAGTGC CTTOACCTGT TTAlGAGACAlS OTGAITQGGA TOXnORTGCC 7620 
CCATCTTGCA ATGCXA.'TGCA CTGTGATTCX: OCACAAlCGCA TTGAA AftTQ S TTITGSKBAA 7680 
GGTOCAGATT AC^AGCTATOQ TGCCATAATC ATCERCASTT gCTT OOC TGG UTTTCReGTG 7740 
GCIGGTCATG CCSUIGCAGAC CTGTGAAGAG XCAGBATGGT CAAGTTCCAT CCCfkRCKLSt 7800 
ATGCCAATAG ACTGTGGCCT CCXTCOCTGAT ATAGATTTTG GAGAC3GTAC TAAACXCAAA 7860 
<3ATGACCAOS OATATTTTGA GCftAQAACanC GACKTQftTGG AAOTXCCAIA TGTGACTCCT 7920
CaCCCTCCTT ATC3«rmK3G AGCftGTGGCT AJVAACCTGt3S AAAATftCAAA QQIiSTCrCCT 7980 
QCTACACMT CATCAAACTT TCTQTATGGT ACCRTGGrrTT CATACACCTG TAATCCAGGA 804O 
TATOiUVCTTC TGGGGAACCC TGTGCTGATC TGCCAGGAflQ ATGGftACXTG GAATOGCAQT 810 0 
5 QCAOCaTCCT GCATTTCAAT TGAATOTGftjC rrTGCCTACTG CTCCTGWWA TGGCTTTTTG 816 0 
CeTTTTACaa AGACf3U3CftT GGGftAGTGCT GTGCMIXKTA GGTGTIUUUCX: TGGACACATT 8220 
CTAGCAGGCT CTGACTTAAG GCTTTOTCEA CffiSAAIAfiAA AGTGGaGTQS TGCCTOCCCA 8280 
□aCTGTQAAG CXATTTX^A-TG CAAZUUkGCCA AATCCAGTCA TGAATGGATC CATCAAAGGA 8340 
AGCAACTACA CATACCTGAQ CACGTTGTAC TATQftGrGTG ADCCCGSATA TGFTGCKSAAT 8400 
10 GGCACTGAQA C3GAGAACAT6 CCW33ATGftC AAAAACTQQS ATGAtSGMOA GCKCATTTGC 8460 
AITCCrGTGG ACTGCAGTTC ACOOCCAGTC TCAKXSCMVS GCCBOOSCShG AGGASAGQAQ 8520 
TACftCATTCC AAAAASASAT TGAATAC2«?r TGCAATGAAO GGTTCTTGCT TOAGOGAGOC 8580 
AOQAQTCGOG TTTGTCTTGC caATGGAAST TOQAOrGGAG CCRCTCCOSA CTGTGTC30CT B640 
GTCAGATGTO CCACCC!C3GCC AiCaACTQGCX: AATGGQGTOA CGGAAGGOCT QOACTATGGC 9700 
15 TTCIVIGAAGG AAGTAACATT CCRCXGTCAC GAGGQCTACA TCrTOC3«3GG TGCTCCAAAA 8760 
CTCACCrarC AQTCAGAOIOS GAACTOGQIUr GCM3AGATTC CXCICKG/IOK JUCCAGlCAAC 8820 
TGTGGACCTC CTGAAGATCT TGC0CATG6T TTCCCTAAT6 CTTTTTOCTT TATTCRTQGQ 8880 
GQCCATATAC AGTATCAGTG CTTTCXTrGGI TATAASCTCX: ATGGAZUVTTC ATCAAGAAQG 8940 
TGCCTCTOCA ATGGCTCCTG GAJSlGGCAtSC TCa^CCITCCT GCCTOCCTtG CftGATGTTCC 9000 
20 ACACCWSTAA TTOAATATGG AZUiTEGTCnAT GGGACAGATT TTGAiCIGTGG AhAQQCAGOC SO 60 
CGSATTCAi^T GCTTCAAAGG CTTCAASCTC CmOOACTTT CTG2MTCS1C CTGIGftAGCC 9120 
GATGGCCTGT OGJVGCTCTGG GTTCXXXX3M2 TGTGAACACA CTTCrCGTGG TTCTCTTCCa &180 
ATQFLTACCRA ATGCGTTCAT CAGIGAGAOC JW3CTCT1GGA AQQAAAATG-T GATAACTTAC 9240 
AGCrGCaM3OT CXGGAIATGT CATACiVRGGC AGTTCAOATC TGA1TTGTAC ABAGftAAGGG 9300 
25 GTATQ6AGCC AGOCTTATCC AGTCTGTGAG DCCTTGTCCT OrGGGTCCCC ACCGTCTgTC 9360 
GGCAATQCAG T06CAACTG6 AGAOGCACIU: 2U3CIATBAhA GTG2UUSrOAA ACICAGATGT 9420 
CTGGAAGGTT ATACGAT6GA TACAGATACll GATACATTCA CCTOTCZU^A AGATGQTOGC 9480 
TGQTTCCCXG AGAGAATCTC CTGdaGTCCT AAAAAATCGTC CTCTCCCGGA AAACATAACA 9540 
CRTATACTTG TA{:!ATrG(3GGA OGATTTC3OT GTGAATAGGC AAGTTTCIGT GTCRTSTGCa 9600 
30 GAAfiGGTATA CCTTTGAGGG A/STTAACKTA TCAOa^ATGrC 2U5CTTGATGG AACCIGGGAG 9660 
CCACCATTCr CC^^CGAATC TXQCMJKCH gJri ' iVA"imTa GGftAftCCTGA AAfilXXSlGAA 9720 
CATCGATTTG TGGTTGGCaa TAAATACROC TTTGAAAGCA CAATTATTXA TCnGTGTGAQ 9780 
CCTOGCTATG AACTAGAGGG GAACAGGGZU^ CGTGTCTGCC AGGAGAACAG ACW3TGQAGT 9840 
GQACaOOGTGG C2ATATGCAA AGAGACCAGG TGTQBAACTC CACTTGAATT TCTCAATGGG 9900 
35 AAAGCTGACA TTBAAAACAG GAOGACTGGA CCCAAOGTGQ TATATICCTG CAACAGAQGC 9960 
TACAGTCTT6 AAGGGCCSVTC TSASGCftCAC TOO tfaflftftA ATGQMOCTG GftGGCAOO^ 10020 
GTCCCTCTCT GCAAACCAAA TCXATGCCGT GTTCCTmO TQKTTOOOGA GAATg CTCTG 10080 
CTGTCIGAAA AGGAOTTTTA TCllTGATCAG AATGTGTCCA TCAAATGTAG GGAAGGTTTT IDIAO 
CTGCTGCAGG QCCAOGGGAT CATTACCTGC AACCCCGAOG AaAOGTGGAC ACAQAmAGC 10200 
40 GCX»AATGTG AT^AAAATCTC ATGTG6ICCA OCAOCTCAOG TAGAAARTGC AATTG CTOOA 102£0 
GGOGIACATT ATCAAl^AXGG AGACATGATC AOCIACICAT QTTACaGTGG AnOUCerEG 10320 
GaSGGTTTGC TGAGGAGTOT TTGiraUSAA AATGGAACAT GGACATCAiCC TCCTATTTGC 10380 
AGAGCTGTCT OTCSATTTCX: ATCTCAGAAT GGQQSCATCT GCCAAOGCOC AAATGOTTGT 10440 
TCCTGTOCftG AGGGCTGGAT GQGGOGOCTC TGTGAAGAAC CAATCTOaiT TCTTCOCTGT 3.0500 
45 CraZ^TkOOGAG GTOGCTGTGT OQCCGCTTAC CAGTGTGACT GCCCQCCKCGG GTGGAOGGGQ 10£60 
TCTOOCTGTC ATACAGCTGT TTQCeaSrCT CCCIGCTTAA AIGGIGGAAA ATGTGaAAGA 10620 
CXSVAACCX^T GTCACrOTCT TTCITCTTGa AlQGGGAGATA ACTQITCCAlQ GTAA 

Seq ID NO: 430 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MNPKLAFCCW 6LAOVSGHAT PQQHSPSRIIIF ^FKLVPSTAP GAP9SIPAFP APGDEAAGSR 
55 VERLGQAFBR RVRUiRBLSB RIaEUVPLVDD SSSVGEVNPa. SEU/iFVSKlih 9DFPWPTAT 
KVAIVTFSSK N7WRRVDYX eTfEBAROBKC ALU.QBIPAI SYRGBOTleTK GAFQQAAQIL 
UlARCHSTKV VFEillXJGySrr GGDFBPZAAS IiEtDSGVEXPT FGXWQGtVIRE TMCHAeVPSB 
EHCYLLHSPE EPEAXARBAI. &£X)I£9GSFI QPDHVHCSYIi CDEGKDOCDR MGSCKCGTHT 
QHFBCICEKa YYGKGIiCJYEC TACPflGTYKP BGSPGGiafiC IPCPDEHHTS PPGSTSPEDC 
60 VCRGGYRASG QTCELVHCPA LKPPENGYFZ OMTOgilRFHA ACaVBCHEGF DPTO3SI1LC 
XiI«GLnSG&& SyCRVBTCFK JOIQBXBGBLS CSISEHmT TCLVMS^GT 1 >TTO!gPK T *TC 
QGNSQHDGPB PRCVERHC8T FQHPKZIVlXS ESNCGKQPKK FSTZCYVSCR QGFII.GGVKS 
MLKCXTSGKR NVGVQAAVCEC DVEAPQIHCP KDlfiAKTLEQ QDSANVTHQI PTARXNSGEK 
VSVHVHPAPT PPYLPPIGDV AIVYTATDLS GPIQAaCIFHl KVIDABPEVI DHCRSPPPVQ 
oS VSBKVHAA3W PEPQFSDHSG AELVIIRSET QCaDLFPQGBT IVgiYTAtDPS GMMRTC DIHI 
VIKaSPCBXP FTFVNGDPXC TEOSnmaCX IrTCLBSnXPT EGSTDlO nroA ZEEXnmKPTY 
TTEHroCAKK XFANUGFEC&F fiKPSTKAARCD DXOLHXKFSB ATBTTtiGKHV PSFCSEtAEDI 
DCRL^BHLTK KYCUEryNyDT ENGFAIGFGG HGAA£IRLDYS YDDFLiDTVQE TATGIGBIAKS 
fSOKaGAPLS IiyKZKLZPHI TA8VPLPi:»ER BDTXjEWSHQQ RIiXATXiBnT NKLXGITLNKD 
70 PKySFOLASB IIiIADSNSIiB 7XKASPFCBP GSVLRGBMCV HGPlATYYlil* EBFTCBSCSt 
6SYQDEBSQI) FCKLCPeGaa TEVXHSRinS PCRSOCKOGT TSYSSLBXCB eGPICTVOPK 
FGSRSCXiSCP imrrSTVKRGA -VHISAOGVPC PBSKFSRSSb MFCHPCPRSY YQI Wftflg APC 
L3^PFYGTTP FAGSASITBC STSVt^ITIF GQFGHLSLLN CPSErVFHBCF ENPuttwg gTC 
— - QQDQaOYVCL. CPLGXTGUKC ETOIDECgPL PCUOTGVCKD LVGBPICECP SOYTGQRCBE 

75 NI»BCS88PC [JIKGICVDGV AOYKCTCVKG FVGXaCBTEV KEOQg HPCU t NAVCEDQVGG 
FIiCKCPFGFIi GTRCGKEIVDB CLSQPCKNGIV «:XDG«ilSPR CZiCRASFTGS RC^JmiBOQ 
SISIPCBZIQATC GQP^fiGKRC KTBQSIGEVL DFE79aZTG3r VMIJ3GH[fP9I> 

HALTCXFNMK SSDDMWYGTP ISYAVEnOSD UnJjMDTfHO WVIiYWIGRFK ITHCPSVNOG 
RHHHIATTWT SANGIWKVYX DSKLeDGGAG liSVGLPIPGG GAbVLGQEOD KKGKGPSPAE 
80 SFWGSlBdtM IiUDiyVLSPQQ VKSEiAZGCPB ELSKGiiVIiASI PDPL86rVOK VKXD8K8ZFC 
SDCFRZiGGSV PHLBTASBDEi KPGSKVKLFC DPGFQZi'TOirP HQKXBQGQtf TQPUSBCERI 
SGGVFFPZ^ GFHSAQDFICA GSTV^SQCEnr GYYIiLG D SHM FC!i'l3lilGS9QIG VfiP&CIiDVDB 
CKVGSDC8EH ASCUIVZ>asy ICSGVPPVTG DGKHCABPZK CmPGNPEHG HSSOBIYTVG 
A£3VTPSGQE6 YQpLMGVTKIT CUBB UKW WI U i IFYCKKV8GS XEftlPSnOCI BBLAFTFQSK 


YSCDiGYSJiQ 

SFKCKBOFVIj 
AYSCUKGPYI 
YQCNPOYKSV 
FI^IEOYELVG 
CKEGHVLQGP 

&FS1A/GSTTT 
RLBGPSMiTC 
AGHAMQTCEE 
HPPYELOAVA 
AP&CX6ZECD 
RCEM9CKKP 
IPVDCSSPPV 

CGPPBDLAHG 
rPVIEYGTVW 
MXFNA?ISEr 
AKAVKTGEAH 
EXIMaGDDFS 



KADIBHRTTS 
liSEKEFYVDQ 
GVHyQYGDMI 



LAGDKE&SCL 

DTIECLADGK 
LIiCNAQOKRV 
MTSAKIECMR 
KGEKKSTCBA 
GSPVFVOQAN 

SVItKCLFSQQ 
TTLOQPDGTW 

IiKTGDWDVDA 
SGH5SSIFTC 
KTHElilTK^SP 
1>PIA?ENGFL 
lYFVMNGSXKa 

NGVTEGUIYG 
FPUGF6FIHG 
GTDFDG6KAA 
SSHKEaJVITY 
TYESEVKLRC 
VNRQVSVSCA 
PESTIIYQCB 
FlOWYfiCtilRG 
HVSIKCRBQF 
TYSCYSGYMIi 
CBEPICILFC 



ANSSnsUSFF 
QIWDRAPPAC 
WSK&DQQCLA 
PPEQQDMPRC 
GOOHNPSPHS 
TGQWSSPIPT 

KHIJKKSHPKC 
WNDSFPVCKI 
SSPLPECVPV 
GKPTCKAIEC 
PSCNAIHCDS 
HPIDCXaijPPH 
ATHSSNFIjYQ 
SFXETSMGSA 
SNYTY1»STLY 
YTPQKEIJSYT 
FHKEVTFHCH 
GRIQYQCFPG 
RIQCFKGFKIi 
SCRSGYVIQG 
LSGYTHDTDT 
EGVTFEGVNI 
FGYELE<3NRS 
YSIiEGPSEAH 
LLQGHOriTC 

LK6GRCVAPY 



VCBFVKCSSP 
HLVFCGBPPA. 
VSCDBPPTVD 
lAHFCEKPPS 
IQCIFVRCGB 
dOPVSOGEPP 
CVPUDCGKPP 
MPAKCPSPPL 
VtiCTPPPLIS 
ECPQPBEXFH 
ItKPKEIljItGK 
PQPIEHGFVB 
XDPGDCTKLK 
\mVSYTCHPQ 
VUTSCKPGHI 
YECDPGYVLN 
CHEGHFIiLEGA 
EGYILHGAPK 
YKbHGtrSSRR 
liOEifiEITCEA 
SSDI>ICTEKG 
DTPTOOKDGR 
SVCQIiD(3TWE 
RVCQENRQWS 
CTENGXWGHP 
NFDBTWTQTS 
BGTHTSPPIC 
QCDCFFGHT6 



TKDAVITGaW 
HASPEXAHRIi 
VSYSHiESVS 
PPSIHBI6YA9 
KVEHOFLBUT 
PIQI9GFMKOE 
IfEIQLVIiIGEZi 
FGVPIPGSMi 
GIZDVQGIAY 
F9VTDUIYGQ 
6ADYSYGATX 
DDGQYFBQBD 
YELLGNPVIiI 

GTERRTOQDD 
RSRVCUU9GS 
Z.TOQSDGNWD 
CbSSOS»BGS 
DGQWS&6PFH 
VHSQPYPVCB 
HFFBRISC9P 
PPFSDBSCSP 
GGVAICKBTR 
VPLCaCPNPCP 
AKCBKZSCGP 
RAVCRFPOQK 
SRCRTAVGQS 



SGl/CTLSTAS 
PTPRKTVTYT 
FGEDIAFYYCS 
KAKFAA6SW 
GSHYSFGAMV 
TGRIFESEVa 
HFEVGSKVQF 
TTEVGWTFS 
HFGSTVKY&C 
liSTALYTCKP 
TVTYSCasnRGF 
IY8CFPGFQV 
DMHEVPYVTP 
CQEDGXnSrGS 
EHBKffSGASP 



WSGATPDCVF 
AEIPIiCKPVN 
8P9CLPCR<:!S 



RbSCXSSPPSV 
KlCQSIfPENIT 
VSCGKPESPS 
CBTPLEFWIG 
VFFVIPSMAL 
PAEIVEKAXAR 
GQXCQRPKAC 
PCLNGOHCVR 



1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2640
2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420
3480
3540



Seq ID NO: 431 DNA sequence

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 1..390



1 

\ 

ATGAGGTTCA 
TATGOXSTCAG 

gaaccatqgc 
cagtgctgtt 

TGCRCCTTCT 

TrrrGTTSTGA 

AGTAAATGTG 



11 
1 

GTGTCTCAGQ 
TCTOTCrOCT 
TGTGCCAGCC 
AC3UITGA)GGC 
GGCCCTGCrr 
A0CrGAAGGT 



21 
I 

CATGAGGACC 
CCTCTTGTGT 
GGCACCCW30 
CATCQTGT CC 
!IGAGCTCT0C 
TCIW9QGTGT6 
GATAT6TTAC3 



31 
1 

GACTACCCCA 
OCAACGGAAG 
•XGTGGSiOACA 
CTGAGC3GAG21 
IGTCTTGAOT 
AftTTGCCAGT 



41 
\ 

GGAGTCTGCT 
TCATOGCTOC 
AG&TCTACaUl 
CCOQCCAATG 
CCTTTGGCCr 
OOCACTCKTC 



51 
I 

QGCrOCTGCT 
OGCTOGCXCA 
CCCCTTGGAG 
TGGTOOCCCC 
CACAAAOGKT 



60 
120 
180 
240 
300 
360 



Seq ID NO: 432 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MRPSVS(34RT DYPRSVIjAPA YVSVCEiLUEiC FREVIAPAGS EPHI*C!QP-AER CGPKXYSPIiG 
QOCYHDAIVfi liSETBQOOPP CTPWPCFBLC CLDSFGLTHD FV^nOiKVOOV ISSQCHSSPIS 

SaCiCERGRIC 

Seq ID NO: 433 DNA sequence

Nucleic Acid Accession #: NM_007231.1

Coding sequence: 89..2017



1 
I 

TAGGAACAGG 
G2U3GCGGAGC 
GTGCA6GGAG 
TGAGAATCAG 
ATAOSCnGrG 
AGGCGOCTTC 
TCTOGAGTGT 

TCCATTcrerr 

TTACAKXGTC 

aocatggaaa 
tcajctgtaat 
ctgggtju3ac 

GCTTCCCAGT 
GACTGGAOTA 
AGCAGCftCTA 
GGOdATOTG 
AQQCATTTCA 
ATGGAAAOAT 
TOCTCIATCA 
TTTGftCAAAC 
CAIOQGCCAT 
ATTCftTTGGC 
ATTTTTT^TTG 
CACAACAACA 
TTTGGGCTGC 



11 
1 

GGAGAGTGCA 
CAI3CC7GAGGG 
AAGGMSAAAG 
GACCGTGGTA 
GGATTAGGAA 
TXGATAOCTT 
TCACmSGGAC 
CAA0GTQTG3 
ATAATTGCCT 
AATTGTTCTT 
GTGAGXACAG 
ATCAACSU^TT 
GAACAATAOT 
ATTGTTTGGT 
TTXAAAQOAA 
GTCCTACrCA 
TACXATATTG 
GCT60CRCTC 
TCXTACRATA 
TGTCTCACTA 
ATATCTGGAA 
TA^^QCABAQQ 
ATGCXITTAA 
AnCAAOATT 
TGCTTGCTTT 




AjCTGGTCCAA 
ATGXOIGGAG 
ATGCAftTTAT 
AATTTGCTAG 

ATAOTCTTTA 
0C3IG6TCAQA 
TGAATAAAG6 
TRACCIGCAT 
GGAATAAAGT 
A3TTAGCACT 
TCAAATC33TC 
TCCTGITAGT 
GAGOCCftSTC 
AGATAITTTA 
AGTTCAAAAA 
GCGTGTTTGC 
AG6AAGITTC 
CTCXhGOCCA 

■EKTTTGCCAA 
TGnTCTCCT 



31 
1 

G37CAAGCTCA 
GGACAA6TTG 
ATCAGAGAAT 
AAAAXGGGAT 
ATTTOCATAT 
GXraGCATTG 

CTTAfaarccA 

G6TCCI6ATC 
CTACATGTTr 
TAAAAACTCT 
AATACAAOAG 
GAACGGGAGT 
GG05CTCCAA 
TTGTCrrCTT 
TX3GCAAGGTG 
ACORGGtGCA 



CT0C3C1TTCA 
CAACTGCTTC 
TQQATTTGCT 
TC3UGTT6XA 
ACXCOOUOGT 
C3(3ATTCTC»S 
AGTORIGTUUS 

iGGTcro^rc 



41 

1 

GCCIAGACIGC 
AAA!X0CCC9GA 
TTCCATGTTG 
TATCTTCTAT 
CTGAOCTACR 
GCrfGGTTTAC 

TCCRTTTTTQ 
GCTTCTTTTC 
AGCAGimiAC 
ATCATCCAAA 
GAAATTl^ATC 
GGGTCAAOTQ 
CTGQCrrTGQC 
GTATATTTTA 
ACTCTGGAGG 
AAACITAAGG 
GTGGCTTGG6 
TCTGATGCCA 
ATTTTTTCTA 
AAATdUSOTT 
GSTCCATTTT 
TTTGCTTCQA 
AAAATGAGGG 
TQTGTOACIC 



51 
I 

AAOAGGAGGC 
tilTTC'i"l'CAA 
GXGAAAKXGA 
CTATGATTGG 
GCAATOTTGG 
CSTTGXrCTT 
GGlbSGATTCT 
TQAlCS\ATCIA 
AAA6TGAAOI 
CAATAQTAAC 
^IGAATAAAAG 
AGCXWIGGCA 
GAATCAATGA 
TCATA<?rT06 
CAGCTCTTTT 
GTGCTTChAA 
AA8CTGA0GT 
GTGGCTTAGT 

TATTGGGACA 
TTGATTTGGC 
GQICCKTATT 
TCGftAftOBAI 
TTCCCA3AAC 
AGGCTOQAAI 



60 
120 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
TTftCl^GGOTT CATCTOATTG ACC3U2TTCTG TGCTQGATOG GGCATTTTAA TTGCRGCTAT 1560
3K:TGG3«5CTA GTTQGAATCA TCTGGATTTA 1GGAGGC3AAC AGATTCATTG ACGATACAGA 1620 
AATGkTGATT GGAGCAAAGA GGTOGATATT CTOSCTATGO TGGaGflGCarT GCTGGTTlTGT 16B0 
A2UCTAC5GCCT ATCCTTTTBA TrTGCRATATT TATCTGGTCA rTTGOTGCART TTCATAGACC 1740 
5 TAATTAIGGC GCRATTCCAT ACCCTGACTG l3C3QAfiTTGCT TTAfSGCrGGT GTATQATTGT 1800 
TTTCTOC3WT ArTTGGATAC CAATTATGGC TATCATAAAA ATAATTCAQ3 CTAAAGGAAA 1B60 
CATCTTTCAA CGCCTTATAA <3TTGCTGCRG ACCaGCTTCT AACTGGGQTC CATACCTGGA 192D 
ACAACATCGT GGGGAAAGAT ATAAASACAT <3GTAGA.TOCT A2VAAMSAGG CTQACCa^TCA 1380 
ARTACCTACr GTTAGTGGCA GCAGAAAACC GGAATQAiGAT CTCATTGAAA MMaXCATG 2040 
10 ATTGTATAAT GTOAimTTT TTAGAATAGG GGGAACCTTA TTTATTTOTG TGTTAACTGA 2100 
ATAGQAAAAT GTACATACTA TGTTCATOAT AGTGTGRm TTTTCACATT TAAGCAGGAA 2160 
TGCRAXATAA AAATGTGAAT CTCrTTAASTC TCAGOCATOT GCTTATTATA TITCTTTTTA 2220 
GATTOTCTAT CTGTATAACR CACACACACA CACCTAABAG TCTCrATXTC ACA ATTATAT 22 BO 
TTTIGTAAAT AOTATATGCA TTTTTAATAC ATTGGIMSGCT TTAITTTOAA CTAATTTCTT 2340 
15 AGAGAATAGT TATATTTTCT ATTAC31CRM3 TTTAAAAATA TTATTAACTT GTATTTTCTT 2400 
AATATlkCAAT CTATCTTTTC CACAAATATG AOTQQGAAAT AAATCW3CAC ATTTGAAAGA 2460 
AAGTOTTAAA ACTGAAGOCC TCACTTAATT AGAAACQTGA TAAATATATG GACS^TGQA 2S20 
CTATACATAC TATAftGAGGA CTGTAGTTTA ATACTTTTTA CCCKMXVXG TTTAAAAAGA 2580 
TOGTQCATTT GTTACAGGTC ATGTTTTCTA TATGAACTTA GTCATTAATG TTCTTTMAA 2640 
20 AAAOKSftAAT AAOAnGSAAA AATTAOGATC CTACRGC5CAG TACGTGATAA ATCTAGAAAA 2700 
ITGaSTTTTO AGTACCrCTT T-TCOCATATA CZiATCTTCXir TCCTTAGGTA ATTTGGAAGA 2760 
AAACTATGAC CCATTTAATT TCTATTGTGT TTCACCAAAT TCa GTGTTG T TCATTATACC 2620 
TCTCTGAAAT ATAGGTTTAA TTTCRAATAG AATATOGACT TAAATGMAA TOBfiAAACTQ 2880 
GCTTTAATCA ATTCTAGCAT TOTATTACTO TAATACAGGG CTGATAGAGT GRTTTTGTCT 2940 
25 mTATGWSTC AGTTACTACr TACAOGTSAT AACTTOCATA CTATTGGAAG ATAAAGTTGT 3000 
CSAACTTOTC AAtS&ATGAGA AAAGCCAAAT TAGftAAATCC TATCTCKTAG TTTCCTTACC 3O60 
AAOQAIAATT AAATATATCA CTAAQflGCTT TATATATTGA TTATATATTO T TGACAAC TG 3120 
6TTTAAGCBT CATAGCCTAT GATGATAAAC ACrGCCEAEA TATGTAAATA GCTTTTraTC 3180 
AATTCITAAA TTTCTTAACC TAGQCTTCAG OQAGCATATG AAACCRAAAT TATATOGAAC 3240 
30 ATTTTCreiG TGTACATOTA CATGCATTTT TCTAGGOAGA GAGTCCGTAG OTTTATOMSA 3300 
ATAICARGGA AftACIGTGAC CCAAAGAftGT TTAAGAATCA CATACAGTGC TGCTGGCTrT 3360 
TTGTOCTTGG CAAATOAGTG ACAATAOAAG AAATAATTTT TCTTAC!hCAT TTlAAAftOaT 3420 
TTICTCTTCG TTOTGATTGA AOATGAAAGG ASTAAGAAAT TAAGGCATTT GTTTAAXTTA 3480 
TACTGGTAAC TTATTTAaQG GGGAGGGQAC ATGAAJaOTAG GTAAATA0GT SOaOCTCTAA 3S40 
35 TTSAAOCMC TCTCTAAGTT ATQTACGTAT ATATAJWSCTG AAATT6TGTT TGACATTCTG 3S00 
AGGGTTTTCr l"l"ri\.Tri"i - i ' CCTTnTTTT TTTTTTTGGX GOGGQeCTGG QGQTCAGftGT 3660 
CTTGrrCTGT TOCCtGGGCT QGAGTGCaCT GQCATGATCT CfeeCICRCie CAAOCTCTGC 3720 
CTTCTQQATT CAAGTGATTC TCCTGCCTCSk GCCTCTTGAS TAOCrGGGAC TACAGGIGOC 3780 
CGCCRCCACA OCftSCTAATT TrTGTATTTT TAGXAGWSGC QftAGTTTCOC CaTGTTOGOC 3840 
40 AGGCTOGTCT TCAACTCCCG ACCTCaMTG ATCTGTCaAC CTDGGCCTCC TAAAGTGCTO 3900 
AiGATXACAGO TQTGAGOCAC COlGCCXKGC C3CRTTCTAAG eGTTTTCTTT GAAGACAGGT 3960 
CRAATGCrGT TAGTAAOTTTT CAGGAQATTG TlftATICC2TC AiSTTATADCA GATTIXMAA 4020 
AATATTTGAG AATAQATGGC TAACMVGAGG TTAGJaRTAC TTTTOCTTAA TTTTAATCCA 4080 
CftGTAlGTlA CATCCATTCT ACCRCTACAT: TTTGOTGCirA TTTAAGGXGT OCAATTTTCT 4140 
45 ATftaeiGRCT TTTGCAATTC AGGQAAGRTT TGGQCATATT AAAIGAAAGA AlATCTAATr 4200 
OOGGGAOOTG TCGRAGGGAAA GAAATTCTTT TCAAAAGCTO ACCACAAASA GTASTTAAAA 4260 
GTTTTTGTGA CTATCTrCAC AAJffTGTGXAA AiSCACftfiATT TCAACAGA0I GCTTGGCATA 4320 
TTGTAOQCSTG CTCAATGCJTQ 6TTTTTATTA TTKTTACTCA GATTCCACAG TGGCAAGAAA 4380 
CA^rCATTCTA CATAATGGftA AACaTTTACA TC3UATOOCA CTTACTTTAA TOGSAACTTG 4440 
50 GAHATAATTT ATQGTATESOr KSTOTMACC ATTAATGAAA ACTTTTTCftC WJTTtaGTGA 4S00 
AATTAAAATC ACTATATCTC 






Seq ID NO: 434 Protein sequence
Protein Accession #: NP_009162.1



1 11 21 31 41 51

HDKLKiCPSFF KCHBKBSEVSA SSEMFHVGBN UBHODHGNWS KKSDVLIi6HI OXAVGLBWVW 60 

RFPYLTYSMG GGAFLIPXAI MLALAGtRLF yLECSIjGQPA SLGBVEWMRI 1.P&PQI3VSIT 120 

60 HVIrlSlFVTI YTOVIIAYSIi YYMEASPQSE I^snaTCSSHS DKMCSRSPIV rrHCNVSTWMK IBO 

GIQBIIQMBIK CWVDINHFTC UroSBJTQPG QLPSBQYMHK VAIiQRBSGMN BTOV^IVKYl^A 240 

liCLUAHIiIV GAALFI03JKS SGKWYETAL PPYWUiIU. VHGATLBBAS KGISYVIGRQ 300 

SWPTKLKEAE VHKDAATQIF YSliSVBHGGL VALSSlfSKFK HWCPfitlAIVV CLTMCLTSWF 360 

AGFAIP^XIiG HMAHISGKBV SQWKBGEDI. AFIAXPBAIA QLFGGPFW8I LFFFHtitiTLG 420 

65 UJSQ^AfilET ITTTIQDLFP JWRKKHSVPI TLGCCLVLFL IiQliVCVTOftO IWVHWDHF 480 

CaGWaiI.IAA ILELVGIIWI YGGHRFIEDT EKMIGAKSHI FWLWWHACHP VTTPHJjIAI S40 

PIWSIWQSHR PKYQAlPyPD HQVALGWCMI VPCIIWIPIM AIIXIIQAKG NlFQRIiJSCC 600 
ItPASHWOPXIk IBQSaaSBSXKD HVP3»KKBADK SIPTVSGSRK PB 

Seq ID NO: 435 DNA sequence

Nucleic Acid Accession #: M18728.1
Coding sequence: 51..1085

1 11 21 31 41 51

| | | | | |

GeRGCTCAAG CTCCTCTACA AAGAI3GTQGA CAGRGAAGAC AfSCAGAGACC ATGGQACCXX: 60 

CCTCRGCXXX: TCCCTGCAGA TTGCATSTCC CCTGOAAOGA GCSTOCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCAlCX»CTG CXAAGCTCAC TATTGAATCC AO GCCflJ TCA IBO 

ATGXCGC3W3A GGGOAAGGftG GTTCTTCTAC T0aCCC3«M CCTQCSCOCfta AATOGTATTO 240 

80 GTTACAGCTG GTACAAAGGC GAAAQftSTGS ATGGCAACAO TCTRATTGTCA GGATATGO'AA 300 

TAGGAACTCA ACAAGCTACC CCftOGQCOCG CATACASTGG TOSAGAOACA ATATAO^CCA 360 

ATQCATCCCT GCTGATCOVa AAOGTCACCK AGAATGACAC AGGATTTCTAT ACCXTTACAftG 420 

TCAXAAAGTC AOATCTTGTG AATSAA3AA6 CAAC3CGGACA GTTOCASaSA TACCCGGAGC 480 

TGCCCRMSCC CTCX3«CTCC AflCRAChACX OCaVI«XX3CGr GOWSGACRaO GATGCTGTOG 540 


GCC3TC!AA3VAG 
ACCGCAGTGA 

ACCCACCrCGC 
TCTTTATCOC 
CAQCC?VCTG(5 
TCCPCTCAGC 

GAATTCTTCT 

ACCCTCftSGC 
GCAAAOCAT6 

TGOCXCTTTC 

AAATGTACAG 

GCrTGAG&CTA 

CTCXAAA2U3C 

TCT^ACCTAS 
GTTAAGGAAO 



CTATCACTGT 
TAGCTCTATA 



TGhACCTGAG 

cagtccc2u9c? 
gaacgrtgcr 
cx:cagtc:acc 
caattaccgt 

ACAfitCACTCT 
CRACATCACr 
CCTCAATAGG 
TGTGGCCACC 
CTQGTGIATT 

BAAGCCCTftT 
CTGAGGTGTB 
GTQAG2^AATT 
CTCATCATGA 
□GTTGGCAG6 
ACAGZUJTGTC 
OTtSCACCCWB 
TGGTCCTTTT 
CCCAGCCATG 
GTGCAGTTTC 
ASTT&XAGAA 
CTTTATTCTA 
TACCCTCCTA 
TTTAAAT6TC 
ACAAAACTCA 
C&AAT(36ieG 
GTGAGOGCAT 
AAGATAGATC 
TTCCAOTCTA 

agxaaccxga 
ttccaatttg 

ACTTGl^UGMS 
ACT 



GTTCAGAACA 
CTGCAiBCrGT 
GGATCCTATG 
CTGAATQTCC 
OC3»30G6AAA 
TG6TTTATCA 
GTQAATAATA 
ACCACAGTCA 
GTCGGCATCa 
TTGQAIATTT 
ATCOCATTTT 
ATC3CZTGGASA 
TGCCACTCAS 
GACGACTTCA 
TAAGGCICTT 
ATQKTGCi'GT 
AGATGTATCT 
TQACIGAiCAT 
CAGAGTTGGA 
CAAT3CCAAA 
TGACTUCTTGT 
ATTAAOUIAT 
TTTTAGTTOQ 
ATAGTCATAC 
TGCATGCAGC 

onaaAATCTG 

TAACTGiATAA 
TOTUSCCAfiTG 
CAATTAAAAA 
CTTGAaTTAC? 
ACTAATCarCA 
ACAAAACCCA 



CAACCTACCT 
COUiTGCCAA 
AATGTGAAA1* 
XCTATQGCCC 
ATCTGAACCT 
ATGG6ACGTT 
GOGOATCCTA 
CGATOATCAC 
CGATTGGAGT 
CAGGAAt^AiCT 
ATOCCATGGA 
TGGACAACTC 
AGACTTCACC 
CACTATOQAC 
ACCCCCTTTT 
CATi'AQTATT 
TGTCAATCOC 
TAGCAGCATC 
CTTCTAGACT 
TAATAGAATT 
TQTIGAACAT 
GTGCTQCTT6 
TTTGTATCTT 
TAGTAQTCRT 
CAGCCATCAA 
TCATCAfiGAG 
TA13CACTAAT 
OTGCTTAAATG 

aaattaaaajc 
ca-taatacag 
tgttaac2c3a 

CrGlTCTTQT 
TTAATTOVTA 



QTGGTGfSGTA 
CATGACCCTC 
ACAGAACCCA 
AQATGTCCXX: 
CTCCTGCCAC 
GCAGCAATCC 
TAIGTGCCAA 
AGTCTCirGGA 
QCTQGCCAGQ 
GGCAGATTG6 
ACCACIAAAA 
AA7GAAAATT 
TAACTAGAGA 
AGCTTTTCCC 
AATTTGTCCT 
TCACAAGAAfi 
AALIaTiTTAC 
TTTAACAOUa 
CA£3eTGrTCT 
GCTGOCTAOC 
GGCTAAATAC 
GTTAAAATaa 
GCCTAAGGTG 
ACTCOCTGGT 
ATAOTGAATG 
AACKTOLTAA 
GCTTtAAGAT 
CTACAT2UZrrC 
CAATTTAAAA 
AA(3TOOCCIC 
TGTATTrATT 



AATOGTCAGA 
ACTCTACTCA 
GOGAGTGCXS^ 
ACCAITTCXX; 
GC3VGCCTCTA 
ACAOUVGAGC 
GOCCATAACT 

AGTGcrccra 

QTGQCTCrGA 
ACCAC3AGCCT 
ACAAGGTCIG 
TAAAGGGAAA 
CAGTCAAACT 
AAGATGTCAA 
TGCTTATGCC 
TAGCrrCAGA 
ATAAAATAAG 
COGTGTGTTC 
CACT0C3CTGT 
AGCTGAACAG 
AATGGQTATC 
CTACACTCAT 
OGTAGTCCAA 
GTAGTGTATT 
GTCTCTCXTT 
CCCATGAAOB 
TTGGTC3VCAC 
CAACTOAAAT 
AAAAAAAAGA 
TAiCTTTAACr 
TCTGTGCTTC 



AATGACAAAT AAAASCG2WT 



600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520



Seq ID NO: 436 Protein sequence
Protein Accession #: AAA59907.1

1          11         21         31         41         51

|          |          |          |          |          |

MSPSSAPPCR ZHVFHXSVLL TASMiTFWnP PTTAKUTIES TPSnVWaOKB VZiLLAHNLPQ 60 

NRIGTSfVYKG SKVDGI7SLTV GYVIGTQQAT FOFAYSGRBT iyPHASLl.IQ HVTQHDnSFY 120 

T£*QV1KSDI»V NEBA^TGQEHV YPElOTPSlfi SmgSKFVSDK DAVAFTCEFB VQJITTYLWWV 180

KQQSIOTSPR WJLSHGHWn. TLr,eVKRHDA GSYECEIQKP ASANHfiDPVT IJmiVGEDVP 240 

TISFSKANYR PGEHUIZiBai AASKFPAQYd WFINGTFQQS TQSLFIPHIT inmSGfiYHOQ 300 
AB198MCaUiR TTVTHITVSG 9AFVIi&AVAT WITIGVLAR VAIjZ 

Seq ID NO: 437 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 1355..1657

1          11         21         31         41         51

|          |          |          |          |          |

QQAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCRGAGACC ATGGaACCCC 60 

CCTCAlGCCCC XCCCT6CAOA TTGCATGTCC CCTGGAAGGA GGTOCZlGCiC ACAGOCTCAC 120 

TlXnAAGCIT CTGQAAOCXai OCCAOCACXG CCAAGCTCAC XATTGRATOC AOOaCftTTCA 180 

ATGTOGCAim GGGGAAOGAO STTCTTCTAC TOGCCCACAA OCTGCCCCAG AATGGTATTO 240 

GrrPACASCTG GXACAAASGC GAAAGAGTGQ ATGGCAACAG TCCAATTGTA GGATAIGIAA 300

TAGGAACICA ACAAGCXACC CCRGGGOOC3G CATACRGXGG TCOAGAGRCA ATATACCCCA 360 

AITGCAXOOCT GCTGATOCAG AAOGTCAOCC AGAATGACAC AGGATTCTAT ACCCTAGAAG 420 

TCA3%IUU3IC AGATCTTOTG AA^TGAAGAAG C3kACCGSACA GUTCCATSTA TACCGQGAQC 480 

tEGCOCAAGCC CTCCATCICC AGCAACAACT CCMCOOCGT G6AGGACRAG eATGCTGTOG 540 

CCITCACCTG TGAACCTGAS GTTCAGAACA CSiAOCTAOCT GlGGlXSGGTft AATQQM\fiA 600

GOCTC3GOGGT CAGTOOCAGG CTGCAGCTGT CCAATQSCAA CATGACCCTC ACTCTACTCA 660 

GCGTCAAAAQ OAACGATGCA GGATCCIATG AATGTGAAAT ACAGAACCCA GOQAGXGOCA 720 

ACCSGCZUSTGA COCASTCACC CTOAAriGTCC TCXAStGGCCC AOATOTCGCC AOCATTTCCC 780

GCTCAaAGOC CAATTAGCSGT OCAGGGOAAA ATCTOAACCT CTCCTGCCAC GCAGC3CTC3A 840 

ACGCACCTGC ACAGTACTCT TGOmMCA ATGQGACGTT OCAGCSUVTGC ACACAAGAGC 900 

TGTTTATCCC CSVACATCACT OTGRATAATA GCGGRTOCTA TATGrGOCAA GOCCATAACT 960 

CAGOCACTGS CCTCAATAGG ACCAObSTCA GGATT3ATCAC AGTCTCTGGA AGIGCTCCTG 1020 

TOCTCTCAGC TGTGGGCACC GTOSGCATCA cmnTGG9U3T OCTGQOCAGG GTGGCTCTGA 1080 

TArCAGCAGCC ClGGlGTATT TTCGATAITT CAGGAAIlACT GGCAGATTGB ACCAGACGCT 1140 

GAATTCTTCT AGCTCCTOCa ATCOCATTTT ATGOCATGGA ACCACTAAAA ACAAGOXCTG 1200 

CTCTGCTCCr GAAGOCCTAT ATSCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCXrrCAGGC CTGAGGTGTG TGGCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCASCQ GXGAGAAATT GAGGACTTCA CACTASDGGAC AGCITTTOCC AAGATGTCAA 1380 

AACAAGACTC CT^TCATGA TAAGGCTCTT AOCCCCTTTT AATTTSTCCT TGCITATGOC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAilPATT TCACAAGAAG TAGCTTCAOA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AAOGTTTTAC ATTIAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TaU3CAGCATC TTTAACACAG COGTGTGITC 1620 

AAATGTACAG TGQTCCTTTT CAOAGTTGOA CTTCTASACT CACCTGTTCr CftCTCCCTGT 1680

TTTAATTCAA CGCAGCCStfTG GAATGCCAAA TAATAGAATT GCXCGCTAOC AGCT6AACAB 1740 

GGAGGAGTCT GTGCAGTTTC TGSUACXTGT T6TTGAACAT GGCT AAATAC AAXGQGlATC 1800 

GCTQAQACTA AGTTSTAGAA ATTAACAAAT GTGCIGCIXG GTTAAAATGG CTACACTCAT 1860

CTQACrCATT CTTTATTCER. TTTTAGTTQQ TTTGTATCTT GGCTAAG6TG GGTAGTCCAA. 1920

CTCTTGGT2«r TACCCTCCTA ATAGTCRTAC TAOTAGTCAT ACTCCCTGGT GTAGTGTATT 1980

CTCTAAAAGC TrTAAATGTC TQCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGJyVTT ACftA2iACTCA GAGAAATGTG TCATCAC3GAG AACATCATAA CCCATGAAGO 2100 

ATAAAAGCCC CAAATGGTGG TAACTOATAA TAGCACTAAT GCTTTAAGAT TTOSTCACAC 2160

TCTGAGCTAS GTGAGCGCAT TGAGCCA5T6 6TGCTAAATG CTACA7ACTC CftACKSftAAT 2220

GXTAftGfS^AS AAOKTAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280

ACACAGOftGA TTCCRQTCTA CTTGAGTTAQ CftTAATACAG AAGTCCGCTC TACTTTAACT 2340 

TTTACAAAAA AQTAAOCXSA ACTAATCTGA TCTTAACCAA TGTATTTATT TCreTGGTTC 2400 

TGTTTCCTTG TTOCAA.TTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCRfiOGGGAG 2460

CTaa?CRCTGT ACTTGTAGAS TGOTGCXGCT TTAftTTCRTA AATCSkCAAAT AAAAGCCAAT 2520 
TAOCrCTATA ACT 




Seq ID NO: 438 Protein sequence
Protein Accession #: AAA59908.1

1          11         21         31         41         51

|          |          |          |          |          |

MDSFSQPVKT RLItlHIRLLP PFNLSIiWPA BFAKQDDAVI SISQEVASEG NLTBGQI'XLV 60 
NFiaVZiHKIKD SliVHPVTDIS SZFHTAVC9S VQnSP&BLDF 



Seq ID NO: 439 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 2370..2501

1          11         21         31         41         51

|          |          |          |          |          |

GGftGCTCAnO CTOCTCTACA AAGAGGTGSA CftOAGAAGAC AGCAGAGACC ATGGGACCCC 60 

COTCAGCCCC TCCCTGCAGA TT6CRTGTC3C CCTGGAAOaA GGTCCTGCtC ACAGCCTCAG 120 

TTCTAACCTT CTGGAACCCA OCCAC3CACTG CCaW3CXCRC TATIGRATCC ACGOCATTCA 180

ATBTOSCBGA GGGGMU3(3AG GTTCrTCTaC TCGCCCAOlA CCTGGGCCSUS AATCQTATTG 240 

GTTACRBCTG GTACftAAOaC GAAAGAGTG& ATGGCftACaG TCrAATTGTA GGATATGTAA 300 

TAGGRACTCA ACAAQCTACC CCNSGOCCOCS CATACAQTOG TOGftGACSACai ATATACCOGA 360 

ATGCRTCCCT GCTGATCGftG AAOGTCACCC AOAATGACAC W5GATTCTAT ACCCTACAAG 420

TCATARAGTC /MSATCTTGTG AAT6AAGUVG CAACXIGGACA GTTCCATCKCA XACCOGGAGC 480 

nooccAAGcx: crccATcrcc aacAACAAcr ccaacoocgt GeAOGncAAc gasksctotog 540 

CCTTCROCTG TOAACCTOAG GTTCAX3AACA CAACC1!ACX:T STOGTGGGTA AAIGGTCRGA 600

GCCICCSaSGr CRGTOCCAQG CTGCAGCTGI^ OCAATGGCRA CArrOAOCCTC actctactca 660

QOGTCAfiAAG G2U«3GATGCA QGATOCTATG AATGTGAAAT ACAGAACCCA QdaACTGOCA 720

ACGGCawarGA CCCAGTCACC CTGAATGTCC TCTATGGOCC AGATGTOOCC AGCATTTOOC 780 

CCTCAAftGGC C3UUETACGGT CCftGGGQKAA ATCIGAA£X:T CTOCTGOOUC GC3UGCCTCIA 840

ACOCSICCTGC AC3U3TAjClCT TGGTTTATCA ATGG(3U3GTI CCAGCAATOC: ACACAAOAQC 900

TCTTTATCOC CAACATCACT GTGAATAATA GOGGAIPCCTA TAT6TQC3C3A GCKCATAACT 960

CaGOC3U2TGG CCTCAATAQG ACCACAGTCA CGATGATCAC fiGTCrCTGGA AGKSCTCCTG 1020 

TCCTCTCAGC TGTGGCCAa2 GTOGGCATCA OSATTGGAGX GCTGGCCAGG GTG6CTCTQA 1080 

IftaCAflCAQCX: CTOSrarATT TTCGKITATTI: CnCGAAGAiCr GGCaCATTGG AOCaiGACOCr 1140 

QAATTCTTCr AOCTCCTOCSA ATCXTATTTT AWXSCATGGA AOCACTAAAA ACRRQGTCTQ 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGOftOA TQCACAACrC AATGAAAATT HAAAGGGAAA 1260

ACCCTCAGGG CTQAGGTGTG TGCCACTCPG AGACTTCACC TAACTAQAGA CAGTCAAACX 1320

GCAAACCATG GTGAGAAATT GACGACTTCA CACTAOXSG&C AflCXTTTCCC AAGATOTCRA 1380

AACAA^^TC CrCarCATGA TAaOGCTCTT ACOCCXATTT AATTTGTOCX TOCTTATOCC 1440 

TGCdCTTTC GCTTOSCAOa ATGATGCTOr CATTA5TAIT TCaC*AAaAAG TAGCTTCAGA 1500 

GGGTMCTTA ACRGRGTGTC AJSATCTATCT TGrCAATCSCC AAOGTTTXAC ATAAAATAAG 1560 

AGATCCTITA GTCCACGCAG TGACTGACAT tTAGCftGCATC TTTAACACIAG OOGTClGTrC 1620 

AAATGTACAG TGGTCCTTTT CAGAi^lMQA CTTCTiySACX CACXnJGrTCC CACTCCC7GT 1680

7TZAATTCAA OOCftg QCAaf G CSAATGCCAAA TAATAGAATT GCTCSCCTACC; AGCTGAACAG 1740 

OGASGAGTCT GlGCAOrTTC TGACAClTGr TGTTGAACAT GGCTAAAXAC AAIOGGTATC 1800 

GCTGAGACm AOTTGrAGAA ATXAACRAAT GTQCTGCTTG GTTAAAATGG CTACACTCyiT 1860 

CTOACTCATT CTTTATTCTA TTTTAerTOG TTTCTATCTT GCCTAAGGTG OGTAOTCCAA 1920 

CTCTTGGTAT lACCCTCCTA ATAQTCATAC TAGTAGTCAr ACTOOCTGOT GTASTGTATT 1980

CTCrAAARGC TTTAAATEaTC TGC&TSCaWSC CAGCXamaA AXACnTQAATG GTCTCTCTT T 2040 

QGCTOGAATT ACAAAACTCA GAGAAATGTG TCATCM3GAG AACATCATAA CGCATGAAGG 2100 

ATAAAAGCCC CAAATGOIGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCROCTAG GTGAOCGCAT TGAGCCAOTG GIGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAa AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280

ACACRGdnOA TTCCftfiTCTA CTTGA6VTAO CATAATACAO AAOTOCCCTC •CACXTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCIGA TSTXAACCAA TGTATTTATT TCTGTGGITC 2400 

TGTTTCCTTG TTCCS^TTG ACAAAA0CC31 CTGTTCTTOP ATTGTATTGC OCftflaGGGAG 2460

CrATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCAXA. AATCACAAAT AAAAGGCSUVT 2520
TAGCTCTATA ACT

Seq ID NO: 440 Protein sequence
Protein Accession #: AAA59909.1

1          11         21         31         41         51

|          |          |          |          |          |

HLrmVFZSW IiFPCSNI*TKP TVLVXaYGFGG AZTVZjVEWCC ElffS 

Seq ID NO: 441 DNA sequence

Nucleic Acid Accession #: NM_002381.2
Coding sequence: 64..1524

1          11         21         31         41         51

AAATCCOAGC CTOGOGTGGG CTCCTOGCCC COGACGGACA CCACCAGGCC CAOOTAGCCC 60

ACCATGCCGC GCCC3GC5CCCC CGCGOGCCGC CTCCCGGGAC TCCTCJCTQCT QCTCTGGCCG 120 

CTGCTQCTGC TGCCCTCCGC CGCCXICCGAC CKCGTGGCCC GCCCGGGCTT CCGQRSGCTG 180 

GAGACCCGAQ GTCCCOQGGG CAGOCCTGGA GQCCGCCCCT CTCCTOCOQC TCC0GACX3GC 240 

GCQCCCQCTT CCOGQACCAG OBAGCCTGGC OSCXSCCX^CXS GXGCAGGXGT TTGCAACAaC 300 

AGACCCTTGa ACCTGGTGTT TATCMTTGAT AQTTCTCGTA GOGTAOGGCC CCTGGAAXTC 360

ACCAAAGTGA. AAftCTrTTOT CTCCCGGATA ATCGACACTC TGGACATTGG GCCRfiCOGAC 420 

ACacaO(3TGG CAGTGGTGAA CTATOCTAIJC ACTGTGAAGA TCGAGTTOCA ACTCCfttJGCC 480

TACACAGATA AGCftGTCOCT GAA0C2^G(3CT GTOGGTCOAA Tq3U3ACCC7T OTCAACAGGC 540 

ACCATOTCAG GCCIAGCCKT CCAiSAGAiOCA ATGGAOGAAG OCTTCAGAGT GOAGGCAGGG 600 

GCTCGAjSASC CCTCTTCTAA CATCCCTAAG GTQGCCRTCA TTGTTACAGR TGGGAGGCCC 660 

CRGGACCAGG TGAATG?iAGT QGCGGCTOGG GCCCAAGCAT CTGGrATreA QCTCTATGCT 720 

GTGGGC<STQQ ACCGGGCAGA CATQGCGTOC CrC3iAOATGA TGQCCaGTGA GCOGCTAiSAa 730 

QAOCRTGTTT TCTACGTGGB^ aAOCTAT{3(30 GTCATTGAGA AACTTTCCEC TAOATTCCAG 840 

GAAACCTTCT GTQOGCTGGA CCCCTGTGTG CTTGGAACRC ACCAGTGCCA GCAOGXCXGC 900 

ATCAGTGATG QGGAAGGCAA GCACCACK3T GAGTGTAGOC AAGGAXACAC CTTQAATGCC 960 

GACAftGAAAA CGTGTTCAGC TCTTCATAGG TG-EGCTCTTA AQACCCAOGG ATGTGAGCAC 1020 

ATCTGTGTGA ATGACAGAAG TGGCTCTTAT CRTTOTGAGT GCTATCiAftGG TTATAOCTTG 1080 

AATGAAQftCA GGMMCTTG TTCAGCTCAA GATAAATC^ C7TTGG6TAC CCATQGGTQT 1140 

CAGCACATTT GIGIG2WI6A CA6AACAGG6 TCOCATCATT GTGAATOCTA IQAOGGCTAC 1200 

ACTCTGAAIS CAGATAAUVA AACATOirTCA GTOCGTGACA AOT6TGGCCT AOSCTCTCAT 1260 

GOTTGCCAGC ACATTTGTCT OAOTQATGGC? GCOGCATCCT ACC!AC3X3TQA TTSCTATCCT 1320 

GGCTACACCT TAAATOMSGft OUiGAAAACR TGTTCftaCCA CTGACSOAAGC AOGRAGACTT 1380 

GrTTCCACTO AAGATGCTTS XGGATE3TSAA QCTACACTGG CATTGCAG6A CAAGOTCASC 1440 

TCGTATCTTC AAAQACTQ&A CACThAACTT GAIGACATTT TQaaOAAOTT GAAAATAAAT 1500 

GAA^WIGGAC AAATACftTCG TCAAATTGCT COUkTTTCTC ACCTGAAAAT GTGGACAQCT 1560 

TGGTGTACTT AATACTQ^TQ CATTCTTTTG CACACCTGTT ATTIIGCCAATG TTCCTBCTAA 1620 

TAATTTGCC& TTATCTOTAT TAATOCTXGA ATRTTACTOa ATAAATTGTA TGAASATCTT 1680

CTGCAGftATC AGCATGA3rrT TTCCAAOOAA ATACATATGC AGftTACTTAT TAAOAQCAAA 1740 

CITTAGIGXC TCIAAGTTAT GACIGIGAAA ^GATTGGTAa GUUVTAGAAT GAAAAGTTIA 1800

QTQTTTCTTT ATCTACTAAT TGAGCCA.TTT AAXITTTAAA TGTTXATATT AOATAACCAT 1860 

ATTCACAATG GAAACTTTAG GTCTAiSTTTC TTTTGATAQT ATTTATAATA TAAATC^ATC 1920 

TTATTACTGA GAGTGCAART TQTACAAGGT ATTTACACAT ACAACTTTdAT ATAACTGRGA 1980

TGAATGTAAT TTTGAACTGT TTAACACXtCT TTGTTmTO CTTATTTTGT TGCAGTATTA 2040

TTOAAGATGT (^aCMOfiGli TTGTAATACA CATATCTAAA AATAGtrTAAC ACIU3%7CM6 2100 

TGAACATTAC ATTGOCATTT TTAATTCATT CTGGTCTTTG AAAGAAATGT ACTACTAAAS 2160 

AOCACTASTT GTGAATTIM OQTGTTAAAC TTTTTAOC3UV> GTACAAAART CCCAAATTCA 2220 

CTTTATTATT TTGCTTCRGG ATCCAflGTGA CARAOTTATA TATTTATAAA ATTGCTATAA 2280 

ATCI^CAAAA TCTAATGTTG TCTTTTTAAT GTTAGTGATC CACClGCiCiC AaCCTCOCAA 2340 

AGlrGCTGGGA TTACAGGCTT GAAAGTCTAA CITTTTTTTA CTTATATATT TGATACATAT 2400

AATTCTTTTG GCnTGAAAC TTOdUVCTTT GAGAAGAAAA CAGXCCTTTA AATTTTGCAC 2460 

rtGCTCAATTC TGTTTTTOGT TTGCATXSTC TTTAIOATAA TAAAAGTTAT TACCSTmCA. 2520 

TATTATCATG TCTATTTTTO ATQACTCATC AATTTTGTCT ATlAAatOATA TTTCTITAAA 2580 
TTAAAAAAAA AAAAAAAAA 

Seq ID NO: 442 Protein sequence
Protein Accession #: NP_002372.1

1          11         21         31         41         51

|          |          |          |          |          |

MntBAFaSBZ, PGLZ.LLLIIPL I.LIiP9AAX>0P VmSGFBXLB TKGPGGSPOR BP8PAAFDGA 60 
PASGTSBPOR ABQAGVCKSR PU3I.VFI1DS SR&VKPLEFT KVITTFVSRir DTLDIGPAZ>T 120 
EtVAWHYASr VKIEFQLQAY rtSKQSLKQAV GRXTPIiSTGT MSSLAIQTAM DEAFTVBAGA 180 
BEP3SHIPJCV AirVTDQREQ DQFVHEVAARA QASGIEZ.YAV GVDBADHASIi KHMA8EPL&B 240 

HVFirvBneav ieklbshfos TECAUspcvia gthqcohvci sdgegkhbcb csQOKrxHaD 300 

KKTCSAUmC AXiBTIBGCEHZ GVHDRSGSXH CBCyBSyTin BURKTCSAQD KG3U£XBGOQ 360 
HICVSIQEZXGS RHCSCyBGyr XMADKICZC37 SDKCALSSBG OQaiCVajGlk ASVHCSlCyPG 420
YTIJIEPKKrC SATBEABKLV ST&D7WGCEA T£AFQp(KVGS YUOBUSngUi DILBK&KXiaK 480 
YGQIKR 

Seq ID NO: 443 DNA sequence
Nucleic Acid Accession #: NM_016639.1
Coding sequence: 40..429

1          11         21         31         41         51

|          |          |          |          |          |

GCOGCSGGGDG CAGACMCJGG OGGG0GCA6G AGGTQCRCTA TGQCTCGGGG CTOGCTGCGC 60
CSGTTGCIGC OSCTCCTCSGT tJCTOGGGITrC TGGCrGGCGT TGCTGCGCTC CGTGG0C6G6 120
GAGCAAGCX3C CAGGCACOGC COOCTGCTGC GGCGGCAGCT CCIGGAGG5C GGAOCIGOAC 180
AA0IGCATGS ACTGCQCSIC TTOOUSGGOG GGAC03CACA GCGACTTCIQ CCTGGGCTGC 240
GCTGCAGCAC CTCCTGCCOC CTTCOGGCTG CTTTaOCCCA TOCTTGGGGG CGCTCIGAGC 300
CIQACCTTOG TGCTGOGGCT GCTTTCTGGC TTTTTGGTCT GGAGACGATQ CC6CAGGAGA 360
GAGAAGTTC3V CXa^CCCOCAT AGAGGSRQftCC OGCGQAGAGG GCTGOCCAGC TGTGGCGCTG 420
ATCC3U3TGAC AATOTGCXICC CTGOCAGCCX3 GGGCTGGCCC ACTCATCATT CATTCATCCA 480
TTCTAGAGC3C AGTCTCTGCC TCCCASAOaC (aaGGGaAjaCC AAGCTOCTCC AACCACAAGO 540
CSGGGTOGGGG OGGGTGAATC ACCTCIGAGG CCTGGGGCCA OGCSTTCnSGG GAACCITCCA 600
AG6TGTCTGG TTGCCCTGCC TCTG6CTCC3W QAACAGAAA0 OGAGCCTCAC GCTGQCrCAC 660
ACAAAACAGC TOACACTQAC !rAA£3GAACrG CAGCATTTGC ACAG06GAGG GQGGTGCCCT 720
CCTTCCTTAG GACCTOGGGG CCaUSGClGAC TIGGGGGGCA GACTTQACAC TASGCCCCAC 780
TCACICAGAT GTCCTGAAAT TOCACCSUIBG GGQTCftC30CT GGGGGGTIA6 GQACCTATTT 840
TTAACACI3\6 GGSCIGGCGC ACXAGGAGGG CEGGCGCTAA QATACAOACC CCXXXAACTC 900
CCCAAAGOSG CaGAGOAGATA TTTKTTTTGG GGAGAGTTTa GAGOGGAGGG AOAATTTATT 960
A»TAhA2hGAA ICITTMCTT TJUUUAAAAA AAAAAAAA



Seq ID NO: 444 Protein sequence
Protein Accession #: NP_057723.1

1          11         21         31         41         51

|          |          |          |          |          |

KARG8IJRRT.L RL&VLOLHIA ItliRBVAGEQA FGTAFCSRGS SHSADliDKCK DCA8CRKR9K 
SDPCLGCAAA PPAPFRLLHF ILGGAI^TF VLQLLSOFLV HRRCEIRREKP 
OCPAVAIiIQ 



60 
12 D 



Seq ID NO: 445 DNA sequence
Nucleic Acid Accession #: AF322916.1
Coding sequence: 50..4300

1 11 21 31 41 51 

I I I I I I 

GCACTCOSCA (KSCTTXMQO T7CGCGGG6G GGGCftOGCAA GAOTTABCCA. IGhAQJUSCCT 60

CAABTCQCGC CTGAGGAGGC AtSQACGTGCC aSGCCCCQGG T06TCIGGCG CCGCX^CCGC 120

CAGCX30GCAT 6CAGCAOKTT GOAATAAATA TGASTGACCGA TTGATGAAAQ CASCAGAAAG 180

GOQaaATQTB. (SJU^AAAGTGA CCTCRATCCT TGCTAAAAACS GGGCSlCftATC CAGGCAAACT 240 

AGATGTGGAA OGCAQATCIG TCrTCCATOT TGTGACCTCA. AAGGGQAATC TTGAGTGarTT 300 

GAAIGCCATC CTXATACATG GA67TBAXAT TACAACCAGX GAOkCTGCAG GGAGAAAXGC ^€0 

TCTTCACCTG GCrOCTA&GT ATGGACATGC ATTGTGCCTA CAAAAACTTC TACA6IACAA 420 

TTGTCCCACT GAGCATGCAG ACCTGCAGOa AAQAACXGCA CTTCAAAAAA AAC?CAATGGC 480 

AGATTGTCCT TCTAQCATAC AGCTGCTTTG TGACCATGOT GCCTCTGTGA ATGCCAAAGA 54€ 

T&CAAAAGG6 CGGACACC^C TTQTTCTCKX: TAClTCAGATG AGTAESGCCM CAATATGTCA 600 

AdQCTOATA GATAGASSnS GGGATGTTAA TTCCRGAOAC AAACAAAACA GAACIGOCCT «60 

CATGCTAGGT TGCQAATATQ OTTGCAGAGA TGCAGTAGAA OTCITAATTA AAAAOlOGTGC 720 

TOATATAAGC TTGCTGGATG OGCITGfSCCA TOATAGTTCT TACTATGCAA GAATTOCJIQA 7B0 

CAATCTGQAC ATTCTAACCT TGTTGAAGAC TGCATOGQAA 7ATAC0AACA AAQG6AGA6A 840 

ACTTTGOAAG AAAfiGACCSiT CCTIGCAACA GCQAAATTTTQ ACACACAT6C AA6ATOAAGT 900 

AAATSIGAAO TCACATCAIQA GGOnSaviCA AAATKTTCA6 GATTTOaAGA TTGAAhAl3GA 960 

ASSATTTGAAA 6AGAGGTTGA GAAAAATTCA QCSU^SAACAA AQAATACTTT TGGATAUOT 1020 

CAATG6TTTA CAGTTACA0C TQAAVGAGGA AGTTATGGTT GdOATQATC: TGGAAAQGGA 1060 

OASABnnAAG CTGAASTCCC TTTTOGCAOC TAAAQAAAAG CAACATGAA6 AAAGCTXAAQ 1140 

GACTATCQAO GCTCXQAAAA ATAGATTTAA ATAITTTGAG AQTOATCATT TAGGATCA06 1200 

AAGTCATXTC AGSftACCSGAA AAGAAOATAT QCTKZIXAAA CAAGGICAGA TCXKTATC3GC 12fi0 

AGACTCACAa TOrACrTCOC! C3USG1ATAGC A6GCCATATG CAAAOCAOAT CXArrQilSAfUG 1320 

ACinX^GGAA CTATCTTTflC OCAGTCAAAC GTCT.TAOTCT GAAAATGAAA TTTTAAAGAA 13 HO 

AGAlSTTAGAA GCAATGOTRA CTTTCMTGA GTCAGCAAAA CAAGACCaSAC TOTUVSCTOCA 1440 

AAATEQAACIG GCACACAAAG TGGCAGAATO CAAAGCTTTA GCATTAGAAT GrGAAAGGOT 1500 

CAASOaSGZa TCAGATQMC AlQATAAftSCA XITKSMiQK£ OCATTAAAAO AUGXQCAGftA 156 0 

OAGCSAXGXAI GNSfSCMSKMS 6TAAAGTTAA ACmaTOCAlS ACCGATITTC TTGOOCTTAA 1620 

AGAACACTTA ACAAGTGAAG CAG0CTCAd3Q GAATCACAGA CTAAG06AGG AACTOAAGGA 1680 

TCAlGTTOAAA QRCTTGAAAG TAAAATATGA AGGTGCTTCA GCAaAASTGG OSAAATTAAG 1740 

AAACGAAATC AAAiCAAAATG AC?ATaATAiST AGAAG2USTTT AAGAGGGATG AA£3GCAAi3CT 1800 

OATM^AAQAA AAfAACSGGMT TACAGAAGSA ACTTM?TATO T&CQMKSGG A6CSAlSAC»A 1B60 

GAAAGGAAGA AASGTCACAG AQATaSAAQG CCSUSGCAAAA GSMTTGTCAQ CGAAOTTGGC 1920 

CCTTTOCATT CCAGCTGAAA AATTTCAAAA CAT13AAGAGC TCAmATCAA ATGAAGT6AA 1900 

TGAGAAAGCA AAAAAAT37U3 TAQAAATGGA AAGAGAACAT GAAAAATCAC TTAGTGAAAT 2040 

TAGACSU3TTA AAGA6AGAAC TTGAGAATGT TAAQQCCAAG CXTGCTCAGC ACGTCAAACC 2100 

AGAGOAACAT CTAACABffmt JM3&GCSSGAIT AiGAAC»GAAA TCAGQAOAAC TTGGGAAGAA 2160 

GATCACTGAG TCAAiCAirTGA AAAATOUQAC ACTAGAAAAG GAAAITGAAA AAGTTTATTT 2220 

GOATAATAAG CiCCTCAAlSG AGCAAGCACA TAACXTAACA ATTQAAATGA AAAATCATTA 22 BO 

TGTTCCTTTA AAAGTAAGTG AAGAC7WTGAA AAAGTCACAT GATGCAATTA TTGATQATCT 2340 

TAATAGAAAG CTTTTAGATG TAACACAAAA ATATACAQAA AAQAAOTTOG AAATOGAGAA 2400 

ATCGCIACTG GAAAATG3UCA GCTTAAdTAA GGATGXAAGC GGCCIASAAA CTOTGTTTGt 2460 

ACCTGCTGAa AAACATCttkAA AAGftGKTAAT AaCTCIOAAA TCX3Ua»l!CTG TTGAACTTAA 2520 

GAAAdCASCTG TCTGAACXTA AQAAAAAATG TGGTGAAGAC CAGGAGAAAA TAC!A£GCl!Cr 2580 

CACA^TCXGAA AACACTAACT T6AAGAAGAT QKTQldSSAKt CAGXAXGTGC CAGTTAAAAC 2640 

CCATGAACSAa ^ZTAAAATQA C3UCTGAATGA CAOGTTAGCC AAAACTAACA GAGAATTATT 2700 

AGATGTGAAG AAAAAATTT6 AAGATATAAA TC3U3GAATTT GTAAAAATAA AAGATAA£3AA 2760 

TQAAATATTA AAAAGAAACC TGGAAAACAC TCAGAACCAA ATAAAAGCTG AGTACATCAG 2B20 

CCTOGCAGAG CAOGAGGCAA AOATOAOCTC GCTAAGTCAG AGCATGAGAA AGGTGCAGGA 2680 

TAQTAATOCT GAAATCTTGG CCAACTACAG AAAAGGCCAA QAAGAQATT6 TGACACTGCA 3940 

TGOOGAAATT AAAGCOCAGA AQAAGOAGCX OGftlCACAATA CAAGAAIGCA TTAAC3QTAAA 3000 

KSKCGCCaOL ATEGTCAGCZ TTGHSGAGTO OSAGAOAAAA TTTAAAdCZAA CAGAGAAAGA 3060 

ACTAAAAGRC CAGXXATCnS AGC3bQACAC3l AAAQTATAGT GTCAGXGAAG AAGAAGTCAA 3120 

GAAAAACAAIS CAAGAGAATQ A£!AAGTTAAA GAAG6AGATT TTTAOCCTTC AiSAAAGATTr 31 BO 

GAGAGATAAG ACAGTTCTCA TTQAQAAGTC TCAT(3AAATG GAMiSfHSCKT TAAGCAGAAA 3240 

AACAGAOGAG CTAAACAAAC ASITAAAAGA CTTGXCACAG AAATACACGO AAI^A2\AGAA 3300 

TGIGAAAGAG AAGCXAGTAG AAOAAAATGC CAAACAGACT TCTGAGATAC TTGCAGTGCA 3360 

AAATCTTTTG CAAAAACAAC ATGTTCCATr G6AACAGGTT GAGGCXCTGA AAAAAXCTCT 3420 

TAATGGCACA ATTGAAAATC TAAAOGAAOA ACTGAAGAGT AXGC3UUVGGT GTTA0GAC3AA 3480 

AGAQCAGCAG ACAGTGACCA AACTGC&TCA ATXGTTGGAG AATCAAAAISA ACTCTTCXGr 3540 

ACCCCIGGCA GAGCATTTGC AGATTAAAOA AOCATITGAG AAAGAAGTTG GAATCATAAA 3600 

JU3CCAGCTT0 AGAGAAAAGG AAGAAGAAAG OCAAAACAAA A7GGAAi3AA<7 TCIOCAAACT 3660 

TC7U3TGGGA6 GXTCAGAATA CTAAACAAGC ATTAAAAAAA tCaTAGAGACTA GAGAGGTAGT 3720 

TGACITSICT AAATATAAAG CAACAAAAAQ TGATTTGGAG ACAC3U3ATIT CTAGCTrAAA 37B0 

TGAAAAATTG GCCAATCTGA ATAGAAAGTA TGAGGAAGTA TGTGAGGAAO TTTTGCZATGC 3840 

CAAAAAjQAAG OAAATATCTG <MAAQATaA QAAGGAATTA CTGCATTTCA GCATTGAGCA 3900 
ASAAATTAAG GATCAGAAGG AAOGATOXGA rAAGXCCTTA ACAACAATCA GftSAGrTAOL 3&60 

AAOAAGAA'lrA C3UU3AAl£CTQ CTAAACAAAT Af3AAGC3U\AA GATAATAAGA TAACTQAACT 4020 


PCT/US02/36810 



GCTTARTGAT GrGGAAAQAT TAAAACAGC3C ACTCAATGGC CTTTCCCAAC TC31CCTACAC 4D80 

AAGTGGGAAC CCCACCAAGA GGCAOAOCCA GCTQATTQRC ACICTGC&GC ACCAAGTGAA 4140 

ATCTCTGGAlG CAACAiSCIGG CC6ATGCTGA C3UC32U»GCAC CAA6AA&TAA TTGCaATTTA 4200 

TCOGACACAC CTTCTTAGTG CXOCSLCafiOG TCACATGC3AT OAAOAT&TTC AGGA<3GCTCT 4260 

GCTCCAGATC ATACAAATGC 6GCAGGGGCT TGTGTGCTAG COGTTAGCAC TOACTGCCAG 4320 

TATCTGTTTT ATCTTGCTGG TGCTQAACAT TCTTTQTGCSi ACTCDCATGQT CTTTCXGGGC 4380 
CTTACTGTGG TCtSTATAATT AAAATAAAAX ATATTTTGTT CTG66TGT

Seq ID NO: 446 Protein sequence
Protein Accession #: AAG49577.1



FGKU3VSGRS 
I^YNCPTEHA 
TICQIiLIDHG 
RZGDHLDILT 



BSLRTIEALK 

BUaX)!UKDIiK 
EREKKGRKVT 
Ii&BIHQLKSB 
KVyXDlIKI.I.K 
1*! M1gifT>TJ.1l?lilT^ 



11 

I 

QDVP6PAS6G 

DLQQRTALQK 
ADVNSBDKQ^ 
LLKTASBMTir 
SKIQQBQRHi 
mFKYPESDH 



21 
I 

AAAASAHAAD 

IiECliSlAIIiIH 

RTAIiMLGCBY 



QIKQIiQDALK 
VKYBBASAEV 



VTLHAEIKAQ 
EEVKKNKQSH 
EVKNVKEKIjV 
Cy£KBQQTVT 
V^KZjQSEVQBI 
VUIAKKKEI8 
ITEIiIiMDVER 
lAIYRTHUiS 



LENVKAKIiAQ 

SLSKOVSHIiE 
IiKKHKQNQXV 
LEHTQ^IKA 
KKBUyriQEC 

EECIAKQTSGX 

TIQQAZiKKIiBT 
AKDEKELIiHF 
IiKQAI^IiSQ 
AAQGHMDEDV 



31 
I 

WMKyODRLMK 
OVDITTflDTA 
QLIiCDHGASV 
BCRDAVSVIjZ 

USKUXiaiJQLQ ZJXEEVKVADD 
LGSGSHF£NR KKDMLLKQGQ 
ItiKKELEAHR TFCESAKQDR 
ISVQKSMYB&B GKVKQMQTHP 
GKUUiUlKUN 
AKIiKLSIPAB 
HVKPBBHBQfV 

KNayvpuws 

TVFVPFEKHB KBlIALK&ia 



41 
I 

AASRGDVSKV 
QRNAUHAAK 
IflAKDVKGRTP 
BHOADISLLD 
QCEV&IVKSHQ 
ZiBSBREKUCS 
MYMADSQCrS 
IiKLQHSLiAHK 
LAIiKBHLTSB 



SI 
1 

TSUiAKKGVN 
yOHALCLQKr^ 
LVLATQMSRP 
AliOHDaSYYA 
REHQNIQDLB 
UAAKBHOUB 
FGIPAHMQSR 
VAECKAIAIiE 
AASQ!3HRIiTE 




LGKKITBCIL 
ICDLNRKIiIiD 
VBLKKQI«6EL 



ETZSIAERE& 
IKVKKAPrVS 
QKDiLRI>KTVIj 
IiAVQtlLIiOKQ 



6IEQEISDQK 
I/nfTSOMPTK 
QEAIiLQIIQK 



XBKSHEMERA 
HVPI.EQVERI» 
QIKSAFEKEV 
ATKSDIi^SX^X 
EEICDSCBLTTX 
KaSOLIXTri^ 
KQ6LVC 



mQDSNABU* 
TEKELKDQIiS 
LGRKTDELNK 
KKSLRGXIBH 



KBram.QKEIB 
VTQKYTEKKL 
KKRC3QSDQB1C 
BDII9Q&FVKI 
ANysnSQiBEI 
BQTQKlfSVSE 
QLKDIiSQKYT 
LISELKSMQR 



TBLQRRIQB8 AnQISAXDHK 
EQfVKSIiBQQL ADnDROHQEV 



Seq ID NO: 447 DNA sequence
Nucleic Acid Accession #: NM_003020.1
Coding sequence: 29..C64 



CGCICCTOGG 
TQGCCTACTG 
OCCTGACGGG 
GGOCATIQOC 
CCAGAGCATT 
eC3CCAACATC 
OGGGTAOCCA 
AAACACCCCT 
'SCaStSMCKI 
GAAGGGAGGA. 
QCaaXAATOTT 
GTAAAGAGAA 
7GCAOGTGTA 
iRCaVAAGCAG 
AAATTAGAAT 
TAAAAATTAA 
TQCAGTTTAA 
TTXGATTTTG 
GCTGTACrCA 
GCAAGCAmS 



11 
I 

QCTSCCCCrC 
TTTTGGCTGG 
GTCTCAGAAG 



GTGGCAOAGT 
GACXXrrCCAA 
OACACTGCAG 
GACTATCCAG 

aAtszusnosAiw 

GTTOCaiAAGA 
GATGCT AGAC 
AATGGAOTOC 
CTGTATQTAQ 
AAOAQCrXTT 
AATGIGAASXS 
AATQGT^TCT 
ATTATGTAGT 



21 
I 

GGTTGACAAT 
CATCTOGATG 
CAGATATCCA 
TGQUWTKTCC 
CCCATGAAG6 
TOACrocjASA 
ATOOCTGTGC 

GCTTGGGCMV 
AGCGSnGGAS 
AGTCTGTOOC 
GAAA&COCAC 
CTGTGAATGA 
ATRGTtSTATT 
TTOTTTCTTQ 
TCAACAATAA 
GAGQTrQTAC 
TCATCCAGOC 



31 
I 

GGTCTCCAGG 
GACTCCAGCA 
GAGGCTGCTT 

ACTTC2USC3\T 
CAACATTCCT 
T6TTGGAAAA 
AGAdTTCCAG 
6TGGAACAAG 
TGTCAATCiCA. 
CCATTTTTCA 
ATTAOCTGTT 
CAGCATGTTT 
G^DCtTCACAC 
GGTTOTTAAA 
AAAGCAAGAC 
TATTTTQGOC 
CTTGGGCATT 



41 
I 

atqgtctcta 

TTTGCTTACA 
CATGGTGTTA 
QCCATQAftIC 

AAigORCXTTA 
ACAGATGAIG 

TT(3ou::cAac 

JAACICCTTT 

GATQAGGATA 
AGGCCTCAGC 
CTTACATAGA 
CQATGRTTCr 
ATGTGAATCT 
TATGAZU^OGC 
AAGTCTGTA9 
GTTATAICACC 
GCGICITAAT 



SL 
I 

CCATC3CTATC 
GCOOOOGGAC 
TGGAGCAATT 



GTGAGGATCA 
GATGTCTAGA 
ATCTCXTTG3V 
AOSAGAAGAT 



AOGATOCAGA 
ATGSCTTATG 
TAATTATQGA 
QCTTTTTGCT 
GCAATGATCA 
TCAGATTTCX 
AAAGCTOTCA 
ABTAAAGAAG 
AAA£!A*ZQAAT 



Seq ID NO: 448 Protein sequence
Protein Accession #: NP_003011.1



51 



60 

120 
160 
240 
300 
360 
420 
480 
540 
GOO 
660 
720 
780 
840 
90O 
960 
1020 
1080 
1140
1260
13 eo



60 
120 
180 
240 
300 
360 
420 
480 
540 
€00 
660 
720 
780 
B40 
900 
960 
1020 
1080 
1140 



1 11 21 31 41 

[ I I 1 I 

MVfifiMVSTML fiGIiZiFWIiASa WTFAFAYSPR TPDRVSBADT ORIiI>H0VMBQ IiGIARPKVBlT 
PAHQAMHIiVG PQSIBGGAHB GLQHLGPFGN lElillVABLTG DNIPKDFSED QGyPDPPIflPC 
PVGKXDDGCli BHTPDTABFS SBPQZiBQBLF DPBHDVFOLO KHlIKimiiirEK MKBGEIIRKRa 
SVHPyi.QGQR LDtlWAKKSV VHFffl^EDKDP B 

Seq ID NO: 449 DNA sequence

Nucleic Acid Accession #: NM_003816.1

Coding sequence: 79..2536

1          11         21         31         41         51

|          |          |          |          |          |

OGGCAGQQTT GGAAAATGAT GGAAGRGG06 GAGGTOGAQG CGACOGAGIG CTGAGAGGAA 
OCTSO QGftAT OGGCOSAGAT GOOGTCTOGC GOGGSCITTC OCrOOGOGAC CCTTCGXGTC 
CGGTGGTTGC IXsTTGGTKOG CCTGGTOGGC CXAGTCSCTOS GTGCBGCGOG GCCAGGCnT 

120
180 



60 
130 
180

CAACAGACCT CACATCTTTC TTCTTATGAA ATTATAACTC CTTGQ7U3ATT AACTAGAGAA
AGAAeAQAAG CCX:CTA!QGCC CTATTCAAAA CftftGTATCTT AT6TTATTCA QGCTGAAGOA 
AAAGA6CATA TTATTCACTT QOAAAGGAAC AAAiQAiCCTTT TOCCTGAAfiA TTTTGTQ3TT 
TATACTTACR ACAAGGAAG6 6ACTTTAATC ACTGACX:&TC C2CAATATACA OMTCATTOT 
aVTTATOQGG GCTAT6TGGA GGQAGTTCaT AKTrCPiTCCA TTQCTCTTAG CJCSACTGTTTT 
GGACTCAGAQ GATTGCTGCA TTTAGAGAAT GCGAGTTATG GGATTGAftOC CCTGCA6AAC 
AGCTCICATT TTGAGCACAT CATTTATCQA ATOGATGATG TCXACAAAGA OCCTCTGAAA 
TQTGCSA0T1T CC3UU»AGQA TATABAGAAA GAAACTGCAA AOOATCAAGA GGAAGASCCT 
CGCZUSCATGA CTCAlGCTACT TCGAAGAAGA AOAGCrQTCT TQCCACAGAC OOGGTATGTQ 
GAGCTGTTCa TTC5TCQTAI3A CflAGGAAAQG TATGACATSA TOGGAAQAAA TCAtSACTGCT 
CrTGAGAGAAG ASATGATTCT CXTTGGCAAAC TACTTGQATA GTATGTATAT TATC5TTAAAT 
ATTCGAATK3 TGCTAGTTQO ACTGGAaSATT TSQACCAATG GAAACCT0AT CAACKTAGTT 
GGGBGTGCTe eiGAXOTGCT OGGSAACTTC GTGCAGPT6GC GGGAAAAGTT TCTTATCACA 
CX3TC6GACAC ATGACROTGC ACAGCTASTT ClAAAGAAAG GTTTTGGTGG AACTQCAGGA 
ATGeCAMTG -rGGGAACSWyr GTGTTC&AGG AGCCACQCAG GCGC3GATTAA TGTGTTTQGA 
CAAATCACTC? TOTAQACATT TGCTTCCATT GTTGCTCATG AATTGQGTCA TAATCTTQQA 
ATGAATCACiG ATCATGGGAG AGATTGTTCC TOTGOnaCAA A0AGCTGCAT CATGAATTCA 
QTTCCSUSAAA CmTAGCAGT TGGAGTGCAG AG8ACTTTGA <3AA23TTAAC7 
TTARATAAAG GAGGAAACT6 CCTTCTTAAT ATTCCTAAGC CTGATGARGC CTATAGTGCT 
OCCTC!CTGTG GTAATAAGTT GOTOOAOGCT GGGGAftSAGT GTGACTGTGG TACTCC3UVA3 
GAATQTQAAT TGGAOCCTTG CTGGGAAGGA AGTAOCTGTA AQCTTAAATC An-TGCXGAG 
TGTGCATATG CTGA CTOTT q TAAABACTGT C3GGTTOCTTC CAGGAGGTAC TTTATGCCQA 
CSGAAAAACCA GTGIVGTGTGA TGTTOCAQAO TACIGCAATQ OTTCTTCICA GTXCjTGTCAG 
OCA8ATGTTT TTATTCAlQAA TQGATATOCT TGGCAGAATA ACAAABCCTA TTaCTAOAC 
GGCATGIGCC AGTATTATGA TGCTCAATQT CAAG^CATCT TTOGCTGZU^ AGCCAAGGCT 
GCCCCCAAAQ ATTGTTTCAT IGAJVGTGAAT TCTAftAC3GTO AC3iflAITTQS CAATTGTGGT 
TTCTCTG6GA ATGAATACAA GAftOTGrGCX:; ACTDGGGAATG CTTTGTGTGG AAAGCITCAQ 
TaVGUM3nAltG TACftAGAGAT AOCTGTATn^ GSAATTOIQC CTGCTAT^T TCSU^ACGCCT 
AGT0SV5GCA CCAAATOTTO 0OQTGTGGA7 TTOCAGCTA6 GATCAGATGT TCCAfiATCCS 
GGGATGGVTA AGGAAGGCAC AAAATOTGOT (3CTQGAAA6A TCTGTAGAAA CTTCCAGTGT 
GTRGATGCTT CTOTTCTaRA TTATGACTGT GATGITCAGA AAAAGTOTCA TQGACATGOG 
GTATGTAATA GCAATAAGAA TTGTCACTQT GAAAATQGCT GGGCax:OCCC AAATTGTGAG 
ACIMM3GKr ACGGaVSGAAfi TBTGGaOUST GGACCTACAT ACAATGMAT GAATACXGCA 
TTGAGGGAiCG GACTTCTGGT CTTCX'TCTTC CTAAnGTTC OCTCTTATTOT CT6T6CTATT 
TTTATCTTCA TCAAOAOGGA TCAACTCTGG AGAAGCTACT TCAGAA3W3AA GAOATCACaVA 
ACftTATGAOr CAGATGGCAA AAATCAT^OCA AACCCTTCTA GACAGCCGGG GAGTGTTCCT 
OGIVCaSCTTT CTCCASTQftC ACCrCCX:!AGA GAAGTTGCTA TATATOCftAA CAGATITQCA 
GIACCAACCT ATGCAGOCAA GCAACCTCAf? CAOTTOaCAT CAAGGCCAGC TCCACCACAA 
GOGAAAGTAT CATCTCAGGB AAfcCTl^AATT CCTGGGGGTC CTGCTCCTGC ACCTCXZTTTA 
TATAQTTCCC TCACTIGATT TTTTTAAOCT TCTTTTTOCA AATGrCTTCA GGGAACTGAG 
CTAATACTTT TTTTTTTTCT TOATOTTTTC TmARAAQCC TTTCTOTTGC AACTATigAAT 
Q2\hAACAAAA CACCACAAAA CAGACTTCAC TAACACAGAA AAACAGAAAC TGAGTGTGAG 
AGTTGIBAAA TACAIUSGAAA TQCAGXIU^ GCAGGGAATT TACRATAACA TTTCO gTTTC 
OVTCATTGUUL TAASZCTTAT TCAGTCAT06 GT9AGGTTAA TOCACTAATC ATGGATTTTT 
TGAAGATGTT ATTGCAGTGA TTCTCAAATT AACTCTATT6 GTGTAAGATT TTTGTCATTA 
AUraTTTAAO tgttattctg aattttctac cttagttatc attaatotag ttcctcattg 

AACATGTGAT AATCTAATAC CTQ1X3AAAAC TGACCAATCA GCT6CCAATA ATATCTAATA 
TTTTTCaVTCA IGCiACGAAXT AATAATOm: ATACTCTAOA ATCTTOTCIG TCACXCAdA 
CATGAATAAO CAA&TATTOT CTTCAAAASA ATGO^CAASA AOCACAArEA AGATQTCATA 
TIMTCTTGAA AGTO^CAAAAT AIACTAAAAG A<JrGTOTt3Ta TftTTCACGCA GrTACTOGCT 
TOCaWTTXTTA TQAOCTTTQA ACTATA0GTA ATAACTCTTA GAQAAATTAA TTTAATATTA 
GAATITCTAT TAT6AATCAT GTGAAAOCAT OACATTCXSTT CACAATMaCA CTATITTAAA 
TOAATXAXAA QCTTTAAGQT ACGAAGXATT TAAIAGATCT AATCAAATAT GTTGATTTCAT 
GGCTATAATA AAGCAGG9U3C AATEATAAAA TCTICAAlCA A*rTGZ\ACITT TACAAAACCA 
CITC3ACUIATT TCA^TQAQCAC TTTAAAATCT GAACTTTCAA AGCTTGCTAT TARATCATTT 
AGAATGTTTA CATTTACTAA GGTGTGCIGG GTC^TSTAAA ATATTAGAC7L CTAATATITT 
CATAOMATT AOQCrGQAOA AAQUVSGAAG AAATGGTm CTTAAATACC TACAAAAAAG 
XTACTSTGGT ATCTATGAGT TKITCATCTTA G/CSHOCGfUXML AAAVGAAXTT TTACTATGGC 
AOATATGGTA T6I3VI08TAA AATTTTAAOC ACTIUUAATT TTTTCATAAC CXTTCSVTAAT 
A7AGTTTAAT AATAGGTTTA TTAACXQAA3* TTCATIA&TT TTTTAAAAGX GTTTTTOgTT 
TerGTATATA TACATATACA AATACAACAT TTACAATAAA TAAAATACTT GAAATTCTCA 
AAAAAAAAAA AAAAAAAAAA AAAAA 

Seq ID NO: 450 Protein sequence
Protein Accession #: NP_003807.1



1 ^^ 
1 1 

pygiDQlVSTVI QABGKEHIXH 
EGVHireSIAI. SDCFGIiRGM. 



LLA1RU1SH7 
AQLVKKEDQFG 
SDCSOGAKSC 
I.VI2AGEECDC 
DVFEYCKtJSS 
IEVH5KGDRF 
W3VDFQIiG0D 
KCECENGWAP 

KQPQQPPSRP 



GXAGHAPVer 



GTPKBCBLDP 
OFGQPDVFIQ 



VFDPGHVNEG 
PHCETKQYGG 
KRSQTYESDG 
PPFQPKVSSQ 



21 
I 

GLVGFVLGAA 

t.'ROMtmT.T.ma 
RLENASYGIH 
XiRRRRAVliPQ 

glbhttngnji 
vcbr5has8i 

MF6BCSAEDF 
OCEOSICKLK 
NGYPOQNmCA 
KKCATWKUC 
TKCGAGKICR 
SVDSGPTOIK 
XNQARPgRQP 
GIsmiPABPAP 



31 
I 

RPGIPQQTSm^ 
DFWYTYNKB 
PXiQMSeHFKH 

XiaXVGGAGDV 
NVFGQITVRT 
EKLTLNKGGN 
fiPABCAYGDC 
yCXNGMCQYY 
GKLQCENVQE 
MFQCVDASVlt 
MI7rALRDQI»E» 
G8VPRHVSPV 
AFPUYSSUr 



4X 
I 

GTLITDHPNT 
riYRNDESVYK 
DKEHYDHM6R 
EASHFVOHHBK 
FABXVAHEEG 
CIjLMIPKSDE 
CKDGRFtiPGG 
DAQGQV1FG6 
IFVFGXVPAX 
IfYDCDVQKKC 
VFFPLIVPLI 
TPFREOTXYA 



51 



240 
300 
360 
420 
480 
540 
600 
€60 
"720 
780 
340 
900 
560 
1020 
X080 
1140 
1200 
12ti0 
1320 
13B0 
1440 
ISDO 
1560 
1620 
16 BO 
1740 
IBOO 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
34B0 
3540 
3600 
3660 
3720 
37B0 
3840 



QNHCBYRCJYV 
KPLKCGVSITK 
NQTAVKEEHI 
FLITRBRHD8 



AYBAPfiOGHK 
TTiCROKTaEC 
XAKAAPKDCF 
IQTP3RQTKC 



VCAXFZFIKR 
JUHFAVPTlfAA 



60 
120 
IBO 
240 
300 
360 
420 
4B0 
540 
600 
660 
720 
780

Seq ID NO: 451 DNA sequence

Nucleic Acid Accession #: NM_016650.1

Coding sequence: 196..789

1          11         21         31         41         51

|          |          |          |          |          |

GGTTTCAA.XA. TAT0CAGATG TCTOGATATA GGAATOAAAT TACQTCTTTa GAACAACTTA 60 

AATAA6TCAA ATATACrTTOG AGCTTTAAAA ATTAAAAGGA GRGRGMI'TCB ABCACCTTTT 120 

CTGCTGCCAT GACAACCATG CAfiGGAATGG AACAGQCCAT GCCAGOGTT6 GCCTQSTGTa 180

CCCCAGCTGG GAAACATOGC TQTCATACAT TCACASCTGT GGAAAOQATT GCAACAGAA6 240

TTCTTOAAGG GAGAACCCAA AGlTCCTTGGtd BTTGTQCAGA TTCTGACTGC CCTOATGAGC 300 

CTTAGCATG6 GAATAACAAT GATGTGTATS GCS^TCTAATA CTTATG6AAG TAACCCTATT 360 

I'COSTQCATA TCGGGTACAC AATTTC3G6GQ TCAGTAATGT TTAITCAtTTTC AGGATCCTTG 420 

TCAATTGCAG CAGQAATTAG AACTACAAAA GGCCTGGTCX: (?Ai3GTAGTCT AGGAATGAAT 480 

ATCAJCCAitSCT CXGTACT<3GC TQCATQUSGO ATCTTAATCA ACACAIITAS CrTOGCGTTT 540 

TATTCATTCC ATCACCCTTA CT6TAACTAC TATOeOACT CAAATAATTG TCATGGGACT 600 

ATGTCC3VTCT TAATGGGTCT GGATGOCATQ qTGCTOCTCT TAAGTGlGCT QQAATTCTGC 660 

ATTGCTGTGT CCCTCTCTQC CTTTGGATGT AAA6T6CTCT OTTGTACCCC TG6TGGGGTT 720 

GTGTTAATTC TGCCATCACA TTCTCRCATQ GCRGAAACAG CATCTCCCaC ACXZACTTAAT 780 

GRGGTTTGAG GCCAACAAAA GATCAAOUSA CAAATOCTCC AGAAATCTAT GCTGRCTGTG 840

ACAOWGAGC CTCAGATGAfi AAATTACCAQ TATOCAACIT CGATACTGAT A£?AOGTGTTG 900

ATATTATISIT TATATGTAAT CCAATTATQA ACTGaOTGTQ TATAGAGA6A TAATAAATTC 960 

AAAATTATGT TCTCATTTTT TTCCCTGOAA CTCAATAACT CAGXTCACTS GCTCTTTATC 1020 
GAGAGTACTA G8AQTTA2AT TAATAAATAA TGCAartTAAT Q]b9aCX»uCAG GAAAAA 




Seq ID NO: 452 Protein sequence
Protein Accession #: NP_057734.1

1          11         21         31         41         51

HAVIHSBLHK GLQSKPURBS FECVIiGWQIL TALMSL6H3I TMHCHA^MTy 6Si9PI8VHIG 60

YTXWOSVHFI ISGfiLSIAAG IRTTKSI^VRa SLGMIflTS&V lAAfiOZLISIT PSIaAJFTSFHH 120

FragyyGH&Ei hcbqihsziiH gi:jx3MVLLI>s vlbfciav3i> safgckstloc tpgcwItIlp 180

SHSHHABIAS PTPUIEV 

Seq ID NO: 453 DNA sequence

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..503

1          11         21         31         41         51

|          |          |          |          |          |

AGTCrCIGCT CTTCOCAGCC TCTCOGGOGC CCTCOaGGG CTXGCOGTOa GQACCATGOG 60 

CGGCMTTGAG CTCCO0CTOQ TCCTGCTGGC GCTGQTCCTC TGCCXAGOGC CCCOGGGQCQ 120 

AGCJOQTCOOS CTGSCCTGCX3G GCXaGAGOOAC OGTGCTGRCC AAGATGTAfX: CX3CGCGGCAA 180 

CCACTGCSGOa QTOGGGCACT TAATGOGGAA AAA0AGCACA GGGGAGTCTT CTTCTGTTTC 240 

TOAGAGAGGG AGCCTGAAOC AGCAOCTOA6 AGA6IACATC AQGTGlGaAAO AAGCIGCAAG 300 

GAATTTGCTQ GCTTCTCATAS AAGCAAAGGA GAACAQAAAC ChCCASOCAC Cl!CAAl3GC3A 360

GGCCTTGGGC AATCftGGAGC CrTCGTGOtSA TTCAGftGtSAT AfiCAOCAACT' TCAARGATGT 420 

AGGTTCAAAA GQCAAAGTTG GTAGACTCXC TGCTCC3«3OT TCTCAACGTG AASGAAQSAA 480

GOCOCAGCIG AACCAGCAAT GUUCAATOATQ GOCTCTCTCA AAAGAGAAAA ACAAAACCOC 540

TAAOa OACT G AGmsnsCAA C CATC BGITG TAOOGKICAT CA2U3UGhTT TCCTTGIGCA 600 

AAArATTTQA CTATFCTGXA !E(l"l"i."iXATCC TTBACZAAAT TCGSGKVi'£T CAA0CAGCAT 660 

CTTCTGGTTT AAACTTBTTT GCTCTGAACA ArXGrCOAAA AQAOTCrVOC AATTRATGGT 720 

TTTTTA^ATC TAGGCTAGCT GTTQaTTAiaA TTCAAGOCCC OGAjSCIGTirA CCATTOUM 780 
TAAAAGCTTA AACACAT 



Seq ID NO: 454 Protein sequence
Protein Accession #: NP_002082.1

1 11 21 31 41 51

| | | | | |

MROSELSLVL XiAUVUZiAPR GRAVPLPAGG GTVLtlCHirPR GStaKKV^SIM 6KKSTGBSSS 60 
VSERGaXiKDD UtEYIRNBEA ABNUdSLlEA XBNIUEIQPPQ PKALGNC»PS HDSBDSSHFK 120 
SWQSKGKVGR Ii&APG&QRBB SHFQUIQQ 

Seq ID NO: 455 DNA sequence

Nucleic Acid Accession #: NM_016522.1

Coding sequence: 265..1399

1 11 21 31 41 51

| | | | | |

GCGQAAGCAC OGAGQAaOGA GCCCCCTTTG GC0GTC3CTCC <?rOGAACOGG TTTTGOQAGG 60 
CIGGCAAAAG COQAGGCTOG AUTCGGGGGA GGAATATTAG ACTCSGnGOA GTCTGCGGGC 120 

o:tttctcctc cccggqcctc coggtdoccg ogggttcacc gctcagtocc cQcaCTCGcr lao 

COGCACCCCA OCXACTTOCT GiGCTGGCGC GGGQGGOGIG lGOCGTGC3aa CIGCOGGAOT 240

TOGGGQAAGT TGTGGCTQTC GAGAAIGGGG GTCtOTGaGT AOCTGTTOCT GCCXIIGOAAG 300 

TGOCTO6T0G tOGTGTCTCT QtfSGClGCTG TfCCTTGTAC CX^ACACGAGT GCCXXSTGOOC 360 

AQCGGAQATO C3CAOCTTCSCSC CAAAGCTATG GACAACGTOA CGGTCOGGCA GGGGGRGAGC 420 

GCCACCXTTCA GGIGGACTAT TGACAACOOG GTCA0C0GG6 TGGOCTGGCT AAACOSCAGC 480

ACCATOCTCT ATGCTGGGAA TQACAAGTQQ TGCCTG6ATC CTOGOaTQGr OLTiVJ-taAQC 540

AftCB COOA AA OGCAGXACAG CATGGAGATC CAGAACOTGa ATGTGTATGA OQIUBGGCXCX 600 

"IftCAOCltSCT GGGrrGCAG^ AOACAACCAC OCAAAGAOCT CxaGOIXXCCA OCICATTQrG 660 

CAlVGTATCrC GCAAAATrTGr AGAGATTTCT TCAGATATCT CX3VTIAATGA AGGBAACAAT 720 

ATTAGCCrCA CCIGCATASC AACXGOXAGA. CCAGAGCCTA 06GTXACTIG GAQACACATC 780 
TCTCCCAAAG
CGGGAACAGT 
CX9GAGAGTAA 

TTCCR6TGGT 

TACACTTGOGr 
CCA6GGGCC6 
CTGCXTCTTC 
CGOGAAA0GC 
OCAATCAGRT 

CCITGCA6AT 

GGCTCAGCCT 
AAATTCRATC 
OCGGGOCAAG 



CQGTTGGCTT 
CAGGGGAC7A 
AfiGTCACCGT 
(3ACAAAAGGG 
ACAAGGATGA 
TCCTCTCftAA 
TGGCCTOCAA 
TCAGCGAGGT 

TGOCGCCACX: 
ATATACAAAT 
CAAAGAAIAC 
AZTXAQGXAC 
AOOCACTGCA 
CTCTQCCCAC 
AGTCCATAGA 
COTOGCSCTG 
ATAAAAAGAG 



TGTGAGTGAA 
06A0TGCAGT 
GAACTATCCA 
GACACTGCAG 
CAAATiGACTG 
ACTCATCTTC 
CAAGCTGGGC 
QAGCAACGQC 
eCTGCTTCTC 
ACCACCACCA 
GAAATTAGAA 
TTTGGG GGOA 
AATGGAfiTTT 
AGCTGCATCCJ 
AGACTGCCCC 
GAOGAACAQA 
CGGGCACTTT 
CAAAAAAAAA 



GACGAATACT 
QCCTCCAATG 
CCATACATTT 
TGTQAAiOCCT 
ATTGAAGGAA 
TTCAATQTCT 
CACACCAATG 
ACGTC6AGGA 
AAATmSAT 
ACAGAACAGC 
GAAACRCAGC 
AAAGAGTITT 
TCTTTTOOCA 
TOCAACCrCT 
CACGTGGAAC 
ATQAQACCTT 
GGTAGACTGT 
AJUUUUAAA 



TtSGAAATTCA 
ACSTTGOOCGC 
CAGAAGCCAA 
CAGCAGTCOC 
AGAAAGGGGT 
CTGAACATGA 
CO^GCATCAT 
GGGCAGGCTO 

araasTGCCA 

AATGGCAACA 
CTCATQGORC 
AAAAAASAAA 
AAOGGGAABA 
TTOGTGC GAG 
ATTCTGQAC3C 

QCCACCACGG 



GGGCATCACX: 
GCCCGTGGTA 
GGGTACAG(?7 
CTCAGCAGAA 
(?AAAGTGGAA 
CTATGGGAAC 
OCTAT'ITQOT 

CTTCCCCACC 
COGACAQCAA 
AGAAATTTGA 
TTGAAAATTG 
AC!AC»J6CACA 
TGTGGGCAAG 
TGGCCATCCC 
CGTGGCGCTT 



840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800



Seq ID NO: 456 Protein sequence
Protein Accession #: NP_057606.1

1 11 21 31 41 51

| | | | | |

HCSVCGYLFLP WKCLWVSLR LLFIfVPTGVP VRSGDATPPK AMDNVTVRQG ESATLRCTID 
KRVTRVAWIiN RSTHjYACaro ICWCMPRWIi LSHTQTQYSI EIONVDVYHE GFYTCSVQTD 
NHFRTSRVHIj rVQVSPKrVB ISSDISINEG IIDIISIiTCIAT grpeptvtwh hispkavgfv 
BEDEYIiBIQG ITKBQSQDXB CSASNDVAAP WRRVKVTVN YFPyiSEAKI} TOTPVOQIDST 
liOCBASAVPS AEFQWYKDIDK SLIEGSCEQQVK VERRSFLSKIi tFtStVSSEBDY GHYTCVASKK 
LGHTNASIPCL FOPOAVSEVS NGTGRRA6CV HLLPIiLVIjBEi ZiIiKF 

Seq ID NO: 457 DNA sequence

Nucleic Acid Accession #: NM_012261.£

Coding sequence: 203..1045



GATrPGCrCT 
ACAOAhHTAGG 
CACrCCAGOG 
OCTCATTCOG 
ACTTC3GAGTT 



TQGGAOQAOG 
□GCSCASCAAC 
TGAGGTGAA6 



11 
I 

GCCAQCA0CT 
CGCTCCCTCC 
GOSAiC^TTGA 
OGCACTGOGA 
CTCCTQATGT 
TCAGGOCIIT 
TGTCTCATOS 
TACBTAi^ATC 
GGCCGCTGTG 
CICAAAATGC 
AlSGCTOAGCA 




CACXXXXX3CT 
OrGATCOSCAG 
TATCTCAGAT 
GOMUSAAAGC 
OGCXBVTTTAC 
Al^GOCAGTAT 
CCAACTQQAir 
CKZAGCTACA 



GGGAAGTCCT 
AA6ACGGTCA 
TTTGTCTTCA 
TTGCGGCTGA 
CAiOBTCSCAjCC 
AAlGCACAT^SG 
CAGGTAGAAC 
ATCAAACAGG 



GGOATTCCCT 
GTATGGATCT 
TQTTCCAXAC 
OCACTAAGOC 

c^AQzusTi^rec 

TGATCACAGA 
aCCACAQCOV 
TCTTT6TAAA 
AAOTGCAGTr 
GGAAGCACAC 
ATOIUSTGTCA 
CCATGATCCT 
OTGAAiGAQCA 



AChAAKTGhC 

GCHAiGAiaaCC 
AACAAAAiGCA 
CCTGQQTATC 



ATGCTGGG^ 
TGRCTCTCCA 
TTG»AAACAT 
TGCTCCCTT6 
TCATGCTCCC 
GTTTAGTGAX 
AAAAOGACTA 
GGGGOACCTO 
TTCTCTGGC 



AAGAGCAATA 
GCTTCTTTGA 
GACACASCTG 



TGXCTTGGGA 
AlGTAACTAV 
AAGAATCAAT 



AATGCCACTP 
QGAGGAAACC 
GCTTATCCTA 
OCCiCTGAAAG 
ATGTTTCACr 



CTG'XOTGnsr 



31 
I 

GGCTCGACAC 
TCTOTOOCCC 
CTCTGGCGGC 
CCAAGGAAQA 
AAIGGCTCAA 
TOAAAAaOAT 
AGOCAAAXTT 
ACAGGOOQAT 
fiXCOGAGCTG 
OGAAAGCCAC 
XGICTAGGAC 
AGCCAACTOS 
ACCrCAACAA 
OTCTGGGQTC 
TAAATGOOCA 
CATCTTOaSC 
TGOCAACCaO 
OTTAGGCAGG 
CTTTTOCATC 
TQAGQCrrGC 
TTGTAGGGXG 
ACAGCTTTOG 
G(?AC3CTOTAT 
CCTTTAGGTT 
TACZ«3TTG1C 
TGATTCATGC 
GCBVCCGGCA 
TTQQACTTCT 
CTGTTTTTCA 



41 SI 

I 1 

OGAGTCCTAQ CTAGClQGCl'C 
6GCTCTCGCT CAOOOOOGCC 
CTCTGCAGCA GtSVCAGCCGG 
GGGG7CCOCA GCATCX3ACAG 
ATCAT8GCAG AAC»AGAAG*r 
A*rA!i:XT6TGG TGCX3GGAAAA 
ATTGTAOCTT ATGATGTCJKs 
ATOOCATTGA CGCGGGGAGC 
GAAfiTGTTCT GGQTQGATCQ 
AACATGTGCA AGGGAOCTGA 
TOCTGGGAGA AAACQCACIT 
<3\iGCACCICX GIGOCTTGGT 
ACCATTTCAC TOaOCTCTAG 
CACATCCAAC CTTTTGACAT 
GTGGAIGAGC GGGAGCAACT 
CrCGTCATCA TGGTAACACT 
GTGCA6AT0C €M!OGGGACAG 
CAOCXXCTAT TCCTGCTCCC 
TTCjrACACGA GATACACCAA 
7TGSCIIGTG TOCATGCITA 
AAATOGCAAT EATTXACTOC 
TGCrCATGGT GGCITGGCTT 
CTGGCCfCCAA AGTTTAGSGA 
CAGAAGRATA TGGOGTGCT!r 
AATGCACACA GAATACAAOC 
TTCIGGCTOG CMTZCCGCAr 
TOCAGGGACT GCAGCItdCCAS 
TCCTGTGCGA GGXCCAAGTC 
AAATGAAATA 



60 
120 
180 
240 
300 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740



Seq ID NO: 458 Protein sequence
Protein Accession #: NP_036393.1

1 11 21 31 41 51

| | | | | |

MEXLQGRGVPS IDRlr1lVLLHI> TETMAQTMAE QBVE!iiLSGlf& THFEKDIFW RENGTTCLKA 
EFAAKFtVPY DVHASNYVDIi ZTEQADIALT RGAEVKGRiCG HSOSELQfVFH VDSAYAIi^KML 
FVKBSHNMSK GPSATWKLSK VQFVYDSSEK THFKDAVfiAG KHTANSHHIiS ALVTPAGKSY 
BCX^AQQ!TISL ASSDPQKTVT MIL8AVSIQP EDIISDFVFS EEBKCFVDER BQLEBTELPIiX 
USLlUSmK VTLMlEHVHa KMTANQVQIP BDRSQYKHMG 

Seq ID NO: 459 DNA sequence

Nucleic Acid Accession #: NM_001169.1

Coding sequence: 85..870



1 11 21 31 41 51




PCT/US02/36810 



| | | | | |

TAGOAOATAA. GAGTATCTTQ CACAGCAGGT GCAGGTTTCC CAGCAGCTCA GGCAAGAQTC 60 

oaaTGTTTGT GCCWTCrGAT CCTGATGTCT CX33U3AGATiU3 CCATGTGreA GOCTGAftTOT 120 

GGCAATGACI^ AGGCCAGGGA GCCaAGC<3TG GGTOGCM36T GQCXSAGTGTC CTGGTACGAA IBO 

aSGTTTGTGC AGCSCATGTCT GGTOGAACTG CTGOGCTCTG CTCTCTTCAT CTrCATCSGOa 240

TGCdGTCGG TCRTTGftGAA TGGOflXIGGAC ACTGGC5CTGC TOCAGCXXSGC CCTGGCCCAC 300 

GOGCTOGCTT TQ£3GQCTCGT GATTGCCA06 CTOGOGAATA TCAGTGGTGQ ACACTTCAAC 360 

CCrGGGOTGT CCCrGGCAGC CAT6CXGATC GQ»5aCCTCA ACCTGaTQAT ISCTOCTCCOB 420 

TACTGGGTCT CACAGCTGCT CXSGGGGGATG CTCGGGGCTG CCTTGGCCAA GGTGGTGAGT 4BQ 

CX!TQAQaaQA GGTTCTOSAA TGCATCTGOG GOGGCXITTTG TGACAGTCCT- GGAGCAOGtM 540

CAOGTGGC2VG GGGCGTTGGT GGCAOASATC ATCCTGAOSA CGCTfiCTGCSC CCTGGCTGTA 600 

TGCATG6CJTQ CCR.TCAATGA GAAGACAAAG GGCCCTCTQG CCCCGTTCTC CATCGGCTTT 660 

GCCGTCACOQ TGGATATCCT G0CIGGQG6C CCIGTGTCT6 GAGGCIGCAT GAAXCCCGOC '720 

CGTGCTTTTG OACCTQCGCSr OGTGGOCAAC CACTGOAACT TCCACTGGAT CTACTOaCTO 7B0 

GGGOCRCTOC TGGCTOGCCT GCTTGTTGGA CTGCTCATTA GGTQCTTCA'X TGGAGATGOG 840

AAGACCCaCC TCATCCTGAA GGCTOGGXGA QCAQAGCTCG TGGGATTCCT aCTGCTCCAQ 900 

GXGXCCrCAG CTCAOCTOTC GCftjaACTtSAG GACAQGGGAG TTCCTaCATT TQCTGCCAiGG 960 

GCAQAOQCXX: AGAiecSAGCGA CCOOCTGCTT OCACTGCTTG GGCCTGCnT CTCAOATAQA 102D 

CIGACTGCTG AGGAGGCTCT AGQTTCTTGG AATTCCTTTS TGCTCATCM A6AOCCCAGC 1080 

CTCGGGAACA COCTaCCCGC ACTGCX3CASA GAGCAGTGCA AACAOCACAA CAOGAGCGTG 1140

TTTCTTGAGA GGAATGTCCC CGAaTTQQAC AAGGAGGCTG TTTCTGC3VCA TCAGCTCATT 1200 

TOCOGCACKC CATTTCTTGC TTGATTGCTT TGTTGQGOGC CTGGCCACTT CCTTQCTTCT 1260 
CAAC3C3GACA ATTCTTCIACTT TGCAATAAAT AGTGCA6TGT TTCCTTCKT 






Seq ID NO: 460 Protein sequence
Protein Accession #: NP_001160.1

1 11 21 31 41 51

| | | | | |

MSOBIAMCBP BFG£IDKAREP SVGGRDfKVSH YERFVQPCLV ELEiGSALFXF I<3CIj9VIENG 60 

TDTGLI42FAL AHGLALGliVl ATUaSilS&Sa FXIPAVSIiAAM XiIGOLNI>in4l» lUFYWVSQIsLG 120 

GBOiGAALAlCV VSPfiBRFWHA SGMFVTVQE QGQVASALVA filXZATUAIi AVCHGnLlNEEC 180 

TlQGFLAPPSI GFAVTVDZIiA GGFVSGGCaill PARABGPAW ANHHSEHSriY WLGPLIAGLL 240 
VGDTiIRCFIG CGKTRlilUCA R



Seq ID NO: 461 DNA sequence
Nucleic Acid Accession #: NM_003226.1
Coding sequence: 2..226

1 11 21 31 41 51

| | | | | |

GATOCTGGSG CTGGTCCTGG C3CTTQCTGTC CTOGAGCTCT OCTQAOGAGT AC3GTGGGCCT 60 

GTCTGCAAAC CAGTQTGCOG TOCOGGGCAA GGACAGGGTO QACTGC3GGCT ACCCCCATGT 120

aUXXX:CAAG GfiGTGCAACA ADOWSQaCTa CTGCITTGAC TOCAC3GATC3C CIGGAGTGCC IBO 

TTGGTOTTEC AAQCCCCTGA CXASGAAGAC AGAATGCACC TTCTGAGGCA CCTCCAGCIG 240 

CCCCXGGGIVr GCAGGCTGAB dCCCTTOCSC GGGCTGTCAT TGCTTOCAOQ CACtGlTGAT 300 

CXCW3TTTTT CTGTCS3CrTT GCTCCGGGCA ASCrTTCTOC TGI^AnSITCA TATCTGGM9C 360 
CTGATGTCTT AACGAATAAA GGTCCCATGC ITCCACOOCS

Seq ID NO: 462 Protein sequence
Protein Accession #: NP_003217.1

1 11 21 31 41 51

| | | | | |

MLQDVliftLLS S6GABEYV6I» SANQCAVPAK DRVDOGYCHV TPKBC39NBGC CFDSRIPGVP 60 

HCFKPUXRKT ECTP 

Seq ID NO: 463 DNA sequence

Nucleic Acid Accession #: NM_002993.1
Coding sequence: 64..408

1 11 21 31 41 51

| | | | | |

gqc3w:gagcc agtctcosco cctcxaccca gcxcaggaac ooscgaaccc TCTCTTGACC 60 

ACTATOAGCC TCCOGTCCAG OGGGGCX3GOC OGTOTCCCEa 6TCCTTCGGG CTCCTTGTOC 120 

GQQCTGGTCG GGCXGCVSCT CCIGCTGMCS OCGCOSGGGC CCCTOQCCAa OQCxeCXCCT IBO 

GTCTCTGCTG TOCXGAJCM3A GClGCOTTGC ACTTGTTTAC GC3G3TAOQCr GAGASXAA^ 240 

CCCAAAACGA TTGGTAAACT GCAGGTGTTC CCC0CRSCSCC CXSCAGTSCTC CAAGGTOGAA 300

GTGGTAGOCT CCCTOAAGAA OBOGAAGCAA GTTTGTCTGG ACCdQGAAGC CCCTTTTCTA 360 

AAQAAAOTCA TCCM3AAAAT 7TTGGACA0T GGAAACAAGA AAAACTGAGT AACAAAAAAQ 420 

ACCATGCATC ATAAAATTGC CCAGTCTXCA GGGGAGGRGT TTTCTGGAGA TCCCTGGAOC 480 

OISTAAGAAT AAGAAQGAAG GGTTGGTTTT TTTQC3UTTTT CTACATGGAT TCXTTTACTTT 540 

GAAGAGTGTG ©SGOAAAGCX: TACSCTTCTC CCIGAASXTT ACROCTCAGC TAATGAAGTA 600

CTAATATAGT A1TTC!CACTA TTTACTGTTA TTTTACCTQA TAAGTTATTG AACCXZTTTQG 660 

CAATTGAC:x:A TATTGTGAQC AAAQAATCAC TGGTTATTAG TCITTCAAIG AATATTGAAT 720 

TGAAfiATAAC TATTGTATTT CTATCATACA TTGCTXAAAG TCITAGOGAA AAGGCT6TGG 780 

ATTTCGTATQ GAAATAATGT TTTATTAGTG TGCTGTTGaDS GOAGGTATCC TGTTGTTCTT 840 

ACTCaCTCTT CTCATAAAAT AGGAAATATT TTACTTCTGT TTTCTTGGOG AATATGXTAC 900

TCTTTAOCCT AOGATGCTAT TTAAOTrGTA CTGTATTAGA ACACTGOGlG TGTCATACOG 960 

TIATCTSTGC AGAATATATT TCCTTATTCA GAAITTCTAA AAATT7AAGT TCTGTAAGGG 1020 

CXAAIATATT CrCT!CCCTAT OGTTTTCASAT GTTTOATaTC TTCITAGTAT GGCATAATQT 1080 

CATOATTOIAC TCATXAAACT TTGATTTTGT ATGCKATTTT TTCACIATAO SATGACXATA 1140 

ATTCTGGTCA
TGATT13CTAA 
AATQATCrGT 
CATTTAGTCC 
TTTAflafiGTT 
AAATTGCACT 



CTAAATATAC 
TTTACATAGA 
GCTCTGCAAA. 
TCAAAATATA 
TTGACCATTT 
TTTATTTTTT 
ATAAAA6A.TT 



ACTTTAGATA 

AATaTAT:rcT 

GTTTTGAAAA 
TACA0CATTG 
TGTTAT6AGG 
CCrOTQTGTC 
TCTAAACCAA 



GATQAAOAAG 
CTXGGTTTTT 
TATATTTGAA 
CIAAGATTTT 
AATTATACAT 
ATGTTGGTTT 
AAAAAAAAAA 



CCC3UiAAACA 
TAAATAAAAG 
CAATTTQAftT 
CAGATATCTA 
GTATCAOVrr 
TTGGTACTTG 
AAAAAAA 



GATAAATTCC 
CAAAATIAAC 
ATAAATTCAT 
TTGTG6ATCT 
CACTATATTA 
T^VTTQTCATT 



Seq ID NO: 464 Protein sequence
Protein Accession #: NP_002984.1

1 11 21 31 41 51

| | | | | |

MSLP8SRAAR VPGPSGSIjCA IjIALLLIiLTP PGFIASAGFV SAVLTBIiHCT CIjKVTLRWP 
KTIGKLQVFP AQPQCSKVEV VASI^KKGKQV CLDPEAFFIiK KVZQKIXiD&G NKKN 

Seq ID NO: 465 DNA sequence

Nucleic Acid Accession #: NM_002039.2

Coding sequence: 106..500



GAAOCGTTTA 

crTcrcTOCT 

OGOTATCQCT 
AGAAAAAGTG 
TGGCCOTCGG 
GCftTOGCGGC 
GCGGCGftGCC 
QCOTOTTCAT 
6TGAGGAOGA 
TCTTOCW3TX 
GTTCTCACTA 
ACTGCAGCCT 
rACAAISCAIG 
CCIAGATOTQ 



11 
1 

CTCGCTGCIG 
CCAAOTTCTA 

CTCGGAf3Af5C 
ASGAGGACTC 
GMCTCGGT6 



A6GTAATATT 
T0AG(3AQTAa 
AOGATCIAGA 
TATTGTCC&O 
CCAACXOCTA 

gqcx:gac(3at 

AAAACAGAAT 



21 
I 

TGCCCATCXA 
GTGACOGAGC 
TACCroCTGC 
TOGGACASCO 
OCAOTCSGCOG 
GCIGCCTCBC 
CTA6TGGCCA 
GGrGCCXTIQA 
GCAGCAGCTC 
ACTTTGCCTT 
GCTAGftSIGC 
GCCTCAAOTIsr 
GCCCAQAKTC 
AAACITCACC 



31 
I 

TC3UGK3U3SCT 
CCGQBCtSOQG 
TCTTCACTTa 
GCTCCGGGTT 
GGCTGCCCGC 
TGAltsAGCTG 
OGCTGGACSIUS 
TGGGCTACGC 

ccaaAAccTC 

TTrrTTTTTT 
AGTGOCXnT? 
ATGCTCCIGT 
CT0AACTTT6 
CAGAAAA 



41 
I 

CCOOGCIGAA 
OSCCACX!ATrO 
CA<3T(3GGGTG 
CTGGAAGOCX: 
GCIGGGCTXC 

CACCCACAAG 
TTCTTCCTTC 



CftCAGATGOS 
CTCAACCTCC 
TCTATCACTC 



Seq ID NO: 466 Protein sequence



1200
1260
1320
1380
1440
1500



51 
I 

GATTGCTTCT 
OSBCAtSAftSG 
GAGGCAGGTA 
CTGJlCCTTCA 
AOC3E3G03CO& 
CTGAATOGGG 
GGTOtSCAGCA 
TATCTOGATA 
TTGGCXrXAAC 
TTTGAGATG6 
AACATAOmC 
CAAGTAGGAT 
TCCCCAAC3A 



Protein Accession #: NP_002029.3

1 11 21 31 41 51

| | | | | |

MSQKAVSLFIi CrhlMLVTCaa VEAGKKKCSE SSDSQSGFnK ALTFHAV6GG [lAVASLEAliG 
PTGAGIAAXVS VAASIMSHSA IL»6GGVPAO GLVAf LQSLG AOGSSWIQK lOHLMGYATH 



60
120
180
240
300
360
420
480
540
600
660
720
780



Seq ID NO: 467 DNA sequence

Nucleic Acid Accession #: NM_003469.2

Coding sequence: 92..1945

1 11 21 31 41 51

| | | | | |

GAAAGGGGCC GAlQEAAISCrCS COCGGAGAAC GOaQAfiSAAX ATOCIGTOQA QCXCCTCTGC 
CATATAAACA AAAAGAGGM ATCTTTCAAA CATGGCTGAA GCAAAGACGC ACTGGCITQQ 
AGCACCCCTO TCPCTTATCC CTTTAATTTT CCTCATXTTCT GOGQCTQAAG CAGCTTCATr 
TCAOnOAAAC CAGCTGCTTC Ai3AAAf^AACC ASACCTCAOa T^TOGAAAATG TCCAAAAOTT 
TCCCA6IOCT GAAATGATCA GGGCTTTGSA OTACATAGAA AAOCIC0(3AC AACAAGCICA 
TAMaGAAl3AA AGC2U3CCCA6 ATTATAATCX: CIAlCXMSaV GTCICIGTOC OCCTTCAOCA 
AAAAGIAAAT QOGGATGAAA GCGACTTGCC OGAGAGGGAT TCncnsaara AAGAAiCaiCIlG 
GA7GAGAATA ATACTOOAAa CXTTGAGIUCA OGCTQAAAAT GftGCCTCAGT CTOCACXZAAA 
AGAARATAACa CCCrATGOCT TGftATTCAGA AAAGAACTTT CCAATGGACA TGAGTGATGA 
TTATGAGACA CAOCAGTGGC CAGAAAGAAA OCTXAAGCAC ATGCAATTOC CTGCTAaGTA 
TEGAAOaGAAX TCCAGGGKTA ACCCCIXTAA AGGCACAAKT GAAAXAGXGG A0GAACftATA 
TACTOCTCAA AGCCITGCTA CATTGGAATC TGTCTTGCAA GAGCTGOGaA AACTG2WCIV3G 
ACCAAACAAC CAGAAACGTO AOAQOftTGGA TGAGGAOCAA A2\ACTTTATA CSGATQATGA 
AGATGATATC lACAAGGCTA ATAACZm^C C£A<rGAAGAr GTGGTCGGGG GAGAAGACTG 
GAACCCAGTA QAOGAGAAAA TAGAGAGTCA AACCCAOOAA GAGGT6AGA6 ACM3CAAAGA 
GAATATAGGA AAAAATGAAC AAATCAAOOA TGAGATGAAA aSCTCAGGGC AGCrEGGCAT 
CC3U0GAAGAA <aTCTT06GA AAGAOAGTAA AG2U3CAACTC TC3UQ2erGArR3 TCTGCAAaGT 
AATTGCCTAT TMAAAASQT rEAGTAAATGC TOCAQQARGT GGGAGGTTAC AQAATGGGCR 
AAATGGOGAA AGGGCCACCA GGCTTTTTGA GAAACCTCTT QATTCTCAGT CTATTTATCA 
GCTGATTGAA ATCTC2\AGGA ATTTACASAT AOXQCAGAA GACTTAATTG AGATGCICAA 
AACieGGGAG AAGCOB&ATG GATCASTGGA ACOGG2V30QQ GAGCXIGACC TTCCTGTTOA 
0CIA<3A,T0AC ATCTCnOAGO CTGACTTM3A CCATCC3VSAC CTGTTCSCAAA ATAGGATGCT 
CTOCrAAsShST GGCEAOCCTA AAACACCTGG TOGTGCTGOQ ACTGAGGCOC TACCAGACGG 
GCTCAGTGTT GAOGATATTT TTkAATCXTTT AGGGATGGRG AGTGCAGCAA ATCAGAATVAC 
GICXSTAXtTT COCAATOCAT ATAACCAG6A GAAAGTTCS^O eCAAGGCTCC CTTATC3QTGC 
TGSAAGATCT AGATCGAACC ASCTTCOCM. ABCTGOCTGG ATTCCACAIPQ TTGAAAACA6 
ACAGKIGGCA TKIBAAAACC TGAAGQACAA GSATCAAGAA TTAGGTGAGT ACTTGOCCAG 
GATQCTAGTr AAATAOOCTG AGA3CATTAA TTC^AACCAA GTGAAGOaAG TTCCTGGTCA 
AiOGCTCATCT GAAGATGACC TOCRQGftAGA GQAACfiAATT GAQCAGGCCA TCAAAifiAGCA 
TTTOAATCftA QQC3W3CTCTC AGGAGACTGA C^AAGCTQQCC CXKGXGnOCA AAAGGTTCCC 
TGTGGGS0C3C OCGAAGAAOG A3GAT2WCCXX: AAASAGGCAO TACTGGOAIG AAGAXClGIT 
AAIGAAAOIG CTGOAATACC TCAATWUSA AAAGGCAGAA AAGGQAAGSG AGCATATTGC 



60 
120 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920






TAAGAGAGCA ATGGAAAATA TGTAAGCTOC TTTCATTAAT lACCCTACTT TCATTCCTCC 1980 

CAOCCCAACC AAATCCCAAC ATTTCTCTTC AQTGTQTTQR CTTCTATCCT GTTAACACTG 2040 

TAASATCTTT AAATGATGTA CAG6CAGATQ AAAOCAGGTC ACTGGOOAST CTGCTTCATT 210O 

TCCTCTGROC TGITATCTIG TGTATOGATA TOTOTAAATO TTATGawCTGC TT6ATAAAAA 2160 

ATTTATTATG TCCATTATTC AA6AAAGATA TCTATGACTG TGTTTAATAQ TATATCTAAT 2220

GGCTQTQC3CZA TTGTrTGATGC TCRCATATGA TAAAAAAGTG TCCTATAAn? CTATTGAAAG 2280 

TTTTTAATAT TTATTQAATT ATTTTGTTAC TGTCTGTAGC GTTTTGTQGA GTACTGQACC 2340 
AAAAAAATAA AGCATTATAA ATATA 

Seq ID NO: 468 Protein sequence
Protein Accession #: NP_003460.1

1 11 21 31 41 51

| | | | | |

MAEAKTHWDO AALSLlPLlF LISGAKRAOT QRNQLLQK5P DLHI.EIJVQKP PSPEHIRALE 60

YIENLKQQAH KEESSPDYNP YQGVQVPIjQQ KENQDBSiiliP EOiDSLSBEDH KRIILBAI/RQ 120 

AEHEPQSAPK SGfKFYAliK&£ IQSPPMDKSDD YKTQQHFERK LKHMQFPPHY SmiSRIUIPFK IBO 

RT23EIVEEQ7 TFQSZAT&BS VFQELQKXM SHSiQKSBEiHD BSQKLyiDCffi DDIYKAHHXA 240 

YECWVGOEEIH NPVEBKISSQ TQBEVRDSKB NIGKNGQIHD EHRRSGQjDGI gSBDIARBSK 300 

DQLSDDVfiKV lAYLKRLVNA AGSGRJjQHGQ NQERATRIiPB KPXJJSQSIYQ I^lBieSNLQI 360

PPEDLIEMLK TQEKPNGSVE PERKLDLPVD liDDI&BADIiD HPDLFQNIUIL SKSQYPKTPa 42 Q 

HAGTEALPDG LSVKOIX^LIi GMESAANQKT SYFENPYNQE KVLPfiliPYGA GR6XtSNQI»PK 480 

AAHIPKVEEIR QHAYESLUDK DQBLGBYliAa MLVKYPEIIN SNOVKRVSOQ GSSEDDLOBB S40 

BQIEQAXKEH rorQGSSQBTD KZiAFVSKKFP VQPPKBID&TP ElEtOYHDEDMi MKVLEyLHOB 600 

KASKC3R&H1A KKAMSHM

Seq ID NO: 469 DNA sequence
Nucleic Acid Accession #: NM_006398.1
Coding sequence: 19..516

1 11 21 31 41 51

| | | | | |

GGCOOCTTGT CTGCAGftGAT GGCTCCCAAT aCTTCCTGC3C *rCtGlGTGCA T6T0C3G1:TCC 60 

GAGGAATOQQ ATTTAAXGAC CTTTGATGCC AACCCATATG ACAGCGl^AA AAAAATCAAA 120 

GAACATGTCC GGTCTAAGAC CAAGGTTCCT GTGCAGQACC AOSTTCTTTT GCTGGGCTCC 180

AAQATCTTAA AGCCS^CJGGAG AAGGCTCTCA TCTTATGGCA TTQRCAAAGA GAAGACCATC 240 

CACCTTAOCC TGAAAGTGGT QAAacCCAdT OAXY^ZUGGAGC TGCCCTTGIT TCTTGTGGAG 300 

TCIUSGTOATa AQ6CAAAGA6 GCACCTCCTC CAG6TGGGAA GGTCCAGCTC AOTOOCACaA 360 

GTGAAAGCAA 7GATGGA6AC TAAQAaSGOT ATAATCCCie A8ACCCAGAT TGT6ACTTGC 420 

AATBGAAAOA CACTGGAAGA 1!GGGAAGATG ATGGCAGATT AOGGGATCAG AAAGGGCAAC 480

TTAdCrrCC TGGCATCTTA TTGTATTQQA GQGTOACCAC CCTGGGGATG GQGTGTTQGC 540

AGGGGTCAAA AAOCTTAXXT CTTTTAAXCI CTTACTCAAC GAACACATCT TCTGATGATT 600 

TOQCIAAAATr AATGJU»ATG AGAT GftGTAO AGTAA0ATTT QOaTSOGAIG GGXAl86AlX3A 660 

AGTATATTGC (XAACTCTAT GrTTTCTTTGA TTCTAACACA ATKAATTAAG TGftCATGATT 720 

OTrrACTAATG TATTACTGAG ACTAGTAAAT AAATTTTTAA GGCAAAASraS AGCATTC

Seq ID NO: 470 Protein sequence
Protein Accession #: NP_006389.1

1 11 21 31 41 51

| | | | | |

HAFHASCLCSr BVBSSEMDLH :CFIlAHPYDSV XKZKEBVRSK TKVPVQPQVIi LIiOSKXIilCPR 60 

RSLSSYGXI»e ESTIHIiTI«lCV VKPSDEBLPIt FZiVEfiGDBAK RBLIdQVRRSS SVZVQVKKHXS 120 
TICTGXZFSTC) IVTQlSKSZiE DSKHMACVGI "RKGISILLEtAS YCXG9 






Seq ID NO: 471 DNA sequence
Nucleic Acid Accession #: XM_094741.1
Coding sequence: 1..948

1 11 21 31 41 51

| | | | | |

ATGAASGOCA ACTACAGCGC ACSAGOAGCGC TTTCTCCIGC TGOGTITCTC CQACTGaCCT 60 

TOCCTGChlGC C3SGTCCTCIT OGCCCTTGTC dCCIGTOCT AOClCCXGAC CITGAOGOOC 130 

AAC T OOqOBC TOGTBCIGCT GOCGGlGCXSC GACOOGCGGC TGCA^CXaCC CATGTIkCTACr 180 

TTCCTCaCGCC ACCTGGCCTT GGTA6AOGOG GGCTTCACTA CTAOCGTGGT GCOGC06CT6 240

CTG6CCAACC TG030(5C3ACC AJSOCSCTCTGG CTGCCGOGCA GCCACTGCAC GaCCCAOCTO 300 

TGOGCATOGC TGGCXCEOGG TTOGGGOSAA TGCOTCCTCC TGGCGGTGAT GGCTCTGGAC 360 

CacaCBiaCCa CAGTGTQCX^S 0CC(3CIGC9GC TKIGGGGGGC TOQICTCXIQC GOGCCTAtEGT 420 

aXhCGGTGG CChiGOGCCTC CTGGCTAAOC GGCCTCACiCA ACTOSGTTGC OCAAACOGCO 480 

CTCCTGGCTG ftGOGGCCGCT GTGCGOGOCC CGOCXGCTG3 ACJCACTTCM CTGTGAeCtG 540

CXXSGCSTTGC TCSIAGCTGGG CTGCOQAGGC GACGGAGACA CTACCGAQAA CCAGATCHTC 600 

GCOGOCOGCG TGOTCATCCT GCTGCIGCCG TTTGCOGTCR TCCTGGCCIC CTAOGGTGCC 660 

GIGGCCGQAG CT6TCXGTTG CATGCGQTTC AGCGGAGGOC GGAGGAlGGaC GGTGGGCACSS 720 

TGTGGGTCCC ACCTGAiCAGC OOTCTGCCTG TTCTAOSGCT CQQCXMCTA CACCTACCIG 780

CAGCCGGGGC AGCGCTACAA CCAGGCAOSa OGOUVfiTTCG TAIC6CTCTT CTACACOGTB 840

GTCAC3U3CTG CTCTCAACCC GCTCATCTAC ACOCTCAGGA ATAAtJAAAGT GAASGOGGCA 900 
GGGAGOAGGC TOCIOCGGAG TCTGGGGAGA GGCCftlOaCTO GGCAGTGA 



Seq ID NO: 472 Protein sequence
Protein Accession #: XP_094741.1

1 11 21 31 41 51

| | | | | |

HKAMYSABBR FMiLCFSDHP SLQFVIiFAI^V ZiLCYIiIiTEiTG HSALVIilAVB DPRIiHTPHYY 60 

FLCHliAIiVDA. IiANLKGPALW LPRSKCTAQL CASUU^SftE CVI>IiAV|1MJ>

RAAA.VCRPIJI YAGLV£PBLC RTLASASHItS BEiTHSVAiQirA Z2«AERPLCAP RUJmFICEL 
PAIalrKLACQQ DGDTTEHaKF AA&wrliLLP FAVTLASYGA VARKVGCHRF 806RRRAVGT 
CGSHLTAVCL FYGSAIYTYli QPAQRYHOiat GKFVSliF^nv VTFALNPblY TLSNKKVKGft 

ARRULHSIiOR 6QAGQ 

Seq ID NO: 473 DNA sequence

Nucleic Acid Accession #: NM_001062.1

Coding sequence: 76..1380



120 
160 
240 
300 



1 
I 

GCTCTCA.TTA 
TACACT3TTG 
TCTTTTATTC 
CTAAAACXnx: 
AATGTTGTGT 
ATCCAACAAA 
GCCTTGATTA 

TACjraccToa 

CACAArCGGCA 
CTGTTPIATG 
AACrATTATT 

ACx:nxy[aTGA 

AACATCAGTA 
GGTCTCATTG 
GACTATTATA 
AllTXdCAAG 
GGAAAGACCT 
AACATCTCXS3 
GTCAATTA^ 



ACAATOGnOQ 
AATAATGACA 
aCTrAQTTAOB 
GCCCAAACTT 
TTATQCCTTC 
TCTCTACATG 



11 

I 

CCTTCTGCCC 
GAGAGATQAG 
CAAGCCAACT 
TQTTGAATAC 
TGTOCCTCftA 
TCAAATACAA 
TACTGGCTTT 
CTQACAAGCT 
CrCCCCTGAC 
GGAACTACTC 
TTQGTAGCCA 
AiSAASAQTCT 
TTTATACAAA 
GAAACACA-rr 
ATGAAAATQA 
G/iGCATTGAG 
TCITGOATAT 
CTGATGAGCC 
CTQTGAGAAX 
TCAOTGTGAT 
AQGOCTCATB 
GAACCTACTS 
TTQTCCQCAA 
TCCTCAGCTG 

TTCAATAAAA 



21 
1 

ATCACTTAAT 
ACRGTCACAC 
ATGCGA(3LA.rT 
AATGATCCAG 
ACTTGTTQaA 
TfiTSAAAAGC 
GGGAGTATGT 
AGAAAATAAA 
TAACTACTAC 
AACCGGCGAA 
GTTCTCAGTA 
AATAAATGGG 
QTCACTEGGTA 
TAGCACAGGA 
CTGGAA7TGC 
TAATCCAAAG 
OIAACAAAGAC 
TA7AACTGTG 
CAA'IGAAACA 
GGAOAAAIGCC 
GGGGOCCTAT 
GGAACTTCTB 
arGGAGAAAAC 
CATAAAATCC 
ATCCCAGTAC 
6TTGTTGAAA 



31 
I 

AAATAdCXlAG 
CAGCTGCCCX; 
TQTQAQGTAA 
TCAAACTATA 
ATCCAGATCC 
AGATTGTCAG 
CQTAACGCTG 
TTCCAAGCAG 
CAOCTCAGCC 
GTTGTCAACC 
GATACTGGTG 
CAGATCAAAG 
GAAAAGATTC 
GAAQCCATGC 
CAACAAACTC 

OCTQcnaccc 

TCTTCTTGCG 
ACACCTCCIG 
TATTTCACCft. 
CAGAAAATGA 
ATC/VOCTGTA 
AOTGOAGGOS 
TTGGAGGTTC 
ATTTGCAGTG 
GASCAGGAGA 
^TTAAC 



41 
I 

CCAATTCATC 
TAGTGGGaCT 
GTGAAGAAAA 
ACAGGGOARC 
AAACCCTGAT 
ATGTAAGCTC 
AQGAAAACTT 
AAATTGAAAA 
TGGACGTTTT 
ACTTCACTCXZ 
CAATGGCTGT 
CAGATOAAGQ 
TGTCT6AGAA 
AOaCCCTCTT 
TGAATACAGT 
AGGTCXTACC 

ACTCACAATC 

atgtcactgt 
atgatactat 

XTCRGGOCCT 
AACCACTGAG 
GCTGGAGCAA 
GAGTTCCATG 
GTTAATAACC 



51 
I 

AACATTCTC3G 
CrTACirGTTl 
CT7M:ATCC5GC 
CAGCGCTGTC 
GCAAAAGATG 
GGQAGAGCTT 
AATATATGAT 
TATGGAAQCA 
GGCCTT6TGT 
TGAAAATAAA 
OCTGGCTCTG 
CAGTTXAAAG 
AAAAGAAAAT 
TGTATCATCA 
GCTCACaOAA 
TGOCCTGATG 
AG6TAACITC 
ATATATCTCC 
QCTRAATGGT 
ATTTQGTTTC 
ATGTGCCAAC 
CCAA6GAGCT 
ATACTAATAA 
TTTATTGTCC 
TCCCCTTCTC 



Seq ID NO: 474 Protein sequence
Protein Accession #: NP_001053.1



X 

\ 

HRQ8HQL.7I.V 
l.KliVGIQ2QX 

TPSTGBSHDA 
DINKD88CV8 
V14EKAQXMHD 
RNGEHLEVSW 



11 
I 

GLLJLF9F1P3 
LMQKMIQQIK 
EHHEIAHMGTP 
AVIALTCVBO: 

ASQJEKISAD 
TIPOFTHEEK 
BKY 



21 
I 

QLCEICSVSS 
YPIVKERI>aDV 
r/FNYYOljaLD 

EPITVTPBDS 
GWGP2XTCIQ 



31 
I 

BHYIKIiKPLL 
SSGFIAIiIIIi 
VUOXILFSIGN 
EGSLKHigiY 
WIiTEI&QGA 
QSVISVNYBT 



41 
I 

NTMIQSSnOIR 
ALQVC3RNABE 
YSTABWNHP 
TKSLVEKILS 
FSNPNAAAQV 
HJQjBTVPXNV 
YNBLIaSOOBP 



Seq ID NO: 475 DNA sequence

Nucleic Acid Accession #: NM_004852.1

Coding sequence: 89..1546



1 
I 

<3CCTGACCAA 
CACTITLT G CAC 
OGGOGGCGGG 
GOOCOQCGGC 
GCTGGGCACG 
GGCCrOGATC 
CATSAGCATG 
GCTGACAOOG 
TCACCOGCAC 
OSl(3U3C3G(3C 
CTACAGTGCC 

GCOQCEoaac 

TCOGCGGGGC 
GftXX:CX3CGQT 
GCACCIGAAC 



11 
1 

GGCCCQGGCC 
ASAOCTAGAA 
GG0C3CGCGCG 
GGCOOSGOCC 
COGCXSTGGCT 
GCOQCAQCOG 
CTGGAGGGOG 
TCCTGCaftCX 
CTCCAGOOGC 
CACCATCCXSC 
AGCTTCACCC 



21 
I 

ClQA3GGftCI 



GOC3GC3QGCAG 
ATGAGCTUSQA 
08CTGCXSGG6 
GQQCAGOBGC 



31 
1 

GAATGAAS6C 
GAACCGGGAG 
TGGCXSOGGGC 
GCTGCT^SCC 
COCTCOGCOa 
GGOGTOGOGC 



GSrClCQGCC 
TGOCACCCAT 
ACCAOCACCA 
TCKIiSOGOGA 



GGAAGAAATC 
TATCX:CXX3W3 
CCTGCTCOGG 
GA^ECTGGAAG 
GTGGAAAGGC 
CGXGQTaTTC 
CCGGTCAAAG 
CAGClkACTTC 



AAOaOGCTAG 
CAOGACAAAA 
GAGCAACACC 
GGGCTGC3U3C 
CGGCCACCCX 
AACAOCAAAG 
GOSATCTTTG 
AATCCAAAAC 
TGGCTTCAGG 
AAAG2U3CRA6 

ACT^^ccrcc 

GAGATGCAGA 
TTCATGAACG 



GOGGCCXCCa^ 
TGCTCftGCOC 
TGICCCGOOG 
ACCOGGGCCA 
C&XCCTCATC 
AGGTGGCOCA 
GGCAGAGGGT 
CGTCTAGTAA 
AGCCOGAGTT 
AACCAAAC3Ulk 
AAOSGCSGAAC 
TCAGOITTTC 
CCC3GGCGC06 



!IGGChTGGGC 
CTCCftCOQTG 
CCSiCCACCAC 
OGAGOGOGGG 
GAGCSCAGIUBC 
CAAGGOGCAG 
C3iACTTCX5AC 
C5CXGGGCftCC 
CACTCAfalCT 



AGOCCCAGGC 
CCITCCAACCG 
TCGGOCAIGS 

rtccATCxxxx: 

ATGftSCAACA 

TCTOAjCAAGX' 
CACC2^GGCC 
CTCOOGGCCA 
CXGTOOCSCSGC 
CAGACTTCVSC 
GOGCACC3UCA 
CC3UXTG0GG 
CAOGGGOOQ6 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500



SI 
I 

GTSAVlilWIiS 
lirtilYDmLID 

TPEHinnnrpQ 

BKKEHGLIGIT 

XiPAiiMaKTni 
TVIISI6SVFL8 
XiSQQAaSVW. 



41 SI 

I I 

TGCCTACACC GOCTATOGArr 
CXGACAATG6 AAASTCTGQG 



GCX9CATC3UA 
GCTGTGOCGG 
ACTCAAATCT 
OCAGOGCATG 
AiSACAGGAAC 
ACTCITCSOC 
CCAGCAGCT6 
CAGCCIGGAB 



GGGGAGCTGA 
TCTCAGGGGA 
GGCAOGGAGA 
arOCGCXTTTAC 
AATTCOCAGA 

AiCTrCAiuao 

GGCCTGGAGC 



60
120
180
240
300
360
420



OCC7VCCA0GC 
CGCACC2VSSA 
TCftOCAGCAT 
T6CAOCAOSC 
OCTACACCAC 
TCX!ACCACOC 
T6TC0GGCAA 
TOAACAACCT 
TOGOCOGCAC 
CCAACTAOGG 
CTGCGATGCr 
CCATOATGTC 
TGCC6GCACC 
COGSCSCAGCT 
AGGGCTACAG 
CTCTCTCCGA 
OCTTCCGCAG 
GCCIGGCAGC 
ASAAOTOCC3G 
AGAACAAAOG 
TCACAAjCCGT 
AOGATCTQAQ 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500






CACAGGGGGC TCCTCGTCCA CCTCCAGCAC OTOTACCAAA GCATGA3X3GA AQOACTCTCA. 
CTTGGGCA<2A. A6TCACCTCC AAATGAGSAC AAOtfSATACC AA?\AGAAAAC AAAGGftAftAA 

GACRC3CGGAT TCCTAGCTGG GGCCCTTCAC TGQTG 

Seq ID NO: 476 Protein sequence
Protein Accession #: NP_004843.1

1 11 21 31 41 51

| | | | | |

MSPEIiTMESL GltiHaARGGG SGGGGGGGGQ QGI3C3<3PGHEQ EIiLASPSf HH ARRGPRGSLR 
GPPPPPTARQ ELGTAAAAAA AA^SAMVTS MA8ZLDGGDY BfGLSlCLBH AK5H5CDSSF 
FGM6M&HTVT TI/FPLQPLPP XSTVSDIKFHE PHPBHHFHHH HBRBHQRLSG »V8G8FTLM& 
DHROI^AMNfiJ LYSPYKEMPG MSQSLSPIJUl TFLCaRSLOaL HNAQQSLFHY GPFQHDKMIiS 
PNEX>AEHTAM LTRGB^^HLSK GUSTPFAAW SHUIGLHHPG HTQSEGPVItA PSRERPPSSS 
S6SQVATSOQ LBBINTKEVA QRITAfiLKKY SIPQAIFAQR VIjCRSQGTIiS DUiRSnPKPWS 
KLKSGRfiTPR KMHKHLQBPB FORMSAIAIA. ACKRKEQEPN KDRNHSQKKS RLVPTDfLQRR 
^FAIFKEHK RPSKEMQITl eQQLGIkELTT V&HPPNBTARR KSLBKHQnDZ* STGC^SSSTSS 
TCTKA 

Seq ID NO: 477 DNA sequence

Nucleic Acid Accession #: NM_013271.1

Coding sequence: 27..809



1560
1620



1 

I 

TCOCSGAGCGA 
CCGGGOGGOT 
TCTGCGOGCG 
AGACXGGOaC 
TGCAGGAGCT 
GQQCGG2U3GC 
TCTGGGGCGC 
CTGCnCCGCA 
CCCAGCTTGT 
ACGACGGCCC 
CCQM3CTGTT 
TGOCAiGGOOC 
CTGACGGCGT 
TGCCTGCACQ 
CSOAAGTGCC 
TXACCCSCaC3C 
OATCTOAGC 



11 
I 

GGCTOQCTSQ 
CQGOCTTTTG 
GCC5GGTAAAG 

TOCToaccac 

OGOGOGGGCG 
QCAQQAOacr 
CCOCCGCAAC 
GCTCQCTTCGC 
CCCCGOGOCC 

GAGGIACTTG 
GCGCCBCCTC 
6CFG66GG06 
CCQCCTCTT*G 
CC06GCATCC 
CA(X3C»3CCC 



GTGCTGCTGC 
GAACOOCJQOS 
TTCOGGOGOT 
CIGGGGCATC 
QAGOATCAGC 
TCIQATCGGG 
GCrCTGCTCC 
GTOCCCGCOG 
GATGCTGAGG 
CSX3GGAOGGA 



CTGCrrGOGTG 
CCACCCTGAG 
GGCCACCavSG 
TCTCAOCOGA 



31 

TGCTOGGCCr 
GCCTAAQCQC 
CA6TGCCCCG 
TGCTGGAGGC 
AGGCGOQOGT 
CTCTGOGCCT 
GOGCCCGCCT 
OGGOGCIOCG 
AOGCAGGOGA 
ITCTi'GCGGG 
OOSACCAOGA. 
TGAAAiCKCCT 
CACTGCCCGG 
ACTTCTCCCC 
GGATOCCTAC 



41 
L 

GTTTCGGCGG 

AQCOTcrrciaj 

AGGTGAGGOG 

CCTGGOGCAG 
G6ACBACGAC 
TGAOCCTOOC 
ACCCO^CCC 
CGA6ACAC0C 



A(3Aj3ACC9CC6 
ATCCCGTGCA 
GCCAQCAOaT 
CCOCIGGOCG 



51 
L 

CCCCOOeCGC 
CCCTTGGCTG 
GGGOGOGOGG 
aAaCQQQCSQC 
CIGCTGCX306 
COCGAOGOGC 
GCCCTTAGCAS 
COGGTCTACQ 
6AOGTGGAOC 
TCQQAOGGGO 
G»SCT6GGC3C 
OCOCCCCAGG 
CCCTGGGACC 
CCAQA6CAAC 



60 
120 
180 
240 
300 
360 
420
480



60 
120 
180 
240 
300 
360 
420 
4B0 
540 
600 
660 
720 
780 
840 
900 
960 






Seq ID NO: 478 Protein sequence
Protein Accession #: NP_037403.1

1 11 21 31 41 51

| | | | | |
MAG8PI.LW3P RAGOVaUiVIi LLLGLFRPPP ALCARPVKEP RGLSAA&PPL ABIGAPRAFR 
RSVEIUaEAAG AVQELARAIiA BLIiEABRQER ASABAQBABD ^KZaKVUIQLIi SVHGAPBNSD 
PALGL0DD6D APAAQIiARAL liSARLDPAAb AAQLVPAPVP AAALRFRPXV YDDGPAGBDA 
fiBASDETPZnr DPELLRYIiLG RIIihG&ADSE GVAAPRfiliRR AADHEFVGSBIt PPJBJVKiGAXJi 
RVKRLETPAP QVPARRIiItPP 

Seq ID NO: 479 DNA sequence
Nucleic Acid Accession #: NM_002214
Coding sequence: 681..2990



1 
I 

CTGC9CGACIT 
GTTGGCCTCC 
TCCCCTGGAC 
TAGGSTOOTT 
CTAAGCTGAT 



TGGCCSTCGA 
GOCCQTAGGG 
COGAGCOSOG 
GaCCC2CQA6Q 



TCXGCCTGCA 
CACTTGTTCT 

GcrGraccAG 

^TCAGSTSG 
GCTC3U3TT(sA 
TTAAXACCCA 
ATlTTATGCr 
ATQXCTC&0C 
CTAGAAAAAT 
AAAOUSTTTC 
ACAATTTAGA 



11 



CTGCOCAOCT 
CTCGCCGOOQ 
TCOCCCCCAG 
XTATGCAGCA 
CAGGCTOCGG 
AGGAGGTGCr 
GOCCTQAGAT 
GGGVGGQCCT 
TOOCCOGGGA 
GXm6CA!IT 
AAACGACOGG 
TGGACTGGGC 



ATCAAGAAGT 
TTCAATAGAA 

GGTGRjCAGCA 
gaaac3ttcat 
ATCAAIGCAC 

ggcatttftc 

ACCATACATT 
CTGCATQCCr 
TQaCAAAiSCA 



21 
I 

TTGCIQOCAT 
<5CrGCTCC36C 
GTGGAAGCAA 
TAOCCTCCCA 
CTTOGGGCTT 
GAAOCXXXAC 
AGCCCTTGCA 
TCTQGC3GGAG 
GCCGAGGGGT 
GCTAGGCCTG 
OGCOGZbSCOC 
ATGTGCGGCr 
OQAGGTCCX33 
CAAGGTGAA3 
CIGGGTOC*AG 
GMU3GTTGTG 
TACCC2VTCTG 
GGAGAAGTGT 
GCTCTGAAGA 
AATAATATAG 
TOOOGTGACr 
AGCATCCnCC 
CCCCATGGAT 
GTTCATAlBAC 



31 
I 

GCCGAGCTTC 
AGAOGGGGCr 
CTGCGCTGAT 
CAGATCCAGC 
TGTTTGGGTX 
CSGGCTGOAGA 
GAGCCCTCTC 
ACC6CGGGAC 
GOCCQGGCOe 
OGGAAAROGT 
GGGTOCQQAA 
C33GCCSCTGGC 
CCTCGTTCCT 
ACMTAGATG 
AATGTQOATG 
ATATTGTTTC 
T6CRIGTTAT 
CTATCCAGCT 
AATATCCTGr 
AAAAATTAAA 
TTCGTCXTOS 
CGGAAAGGAT 
ACATGCAT6T 
AGAAGATCTC 



41 

I 

CTCCCTTGQC 
GCAAAGCIGC 
TGATGGGCCA 
ATCRCOTIOT 
TGATIGTGn? 
GAAACAAAAG 
¥CCAG!IX3GCIC 

GCTTACC3X?C 
CCXAGOGACA 

Gocawas ftag 

TTTTTTTACC 
CTGGGCAGCC 
TGCATCTTCA 
OTGTOTTCAA 
CAI^kinAAIA 
AATACCC3U3 
GOGTCCAGGA 
GGATC^XAT 
TTCCGTTGGA 
ATTTOGCXCA 
TCATAATCMV 
GCTGTCTTTG 
TOGAAACATA 



SI 
] 

AGCCAGGAOG 
AACThATOGX 
CAQACm^T 
GAAIGTACAT 



GC3CQaGCCCT 
GAGOCX3GGAG 
ACCGCTTOCT 
CT0QCCCGC3G 



GCTGCKSmO 

AATGCAGCAT 
GAGGAXTTCA 
AGCAAAGGCT 
OAAAATQAAA 
GCOQAAGCm 
TOWCCXTG^TG 
AAOGATTTAT 
XACBTTGATA 
TGChGTGAiCr 
ACAORiQAACA 



60 
120 
180 
240 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440



AAGGA(3GTTT TGACQCXiaVIG CTTCAGGCAG CTGTCTGT0A AA6TCATATC GGRTGGCGAA 1500

AAGAOGCTAA AAGATTGCTQ CTGGTGATGA CAOJITC^^GAC GTCTCATCTC GCTCTTGATA IS 60 

□CAAIOrTGCSC AOGCATAGTQ GT6CCCAATG AOQQAAACTG IfCATCTGAAA AAGJUlCOTCT 1620 

AOCSTCZUATC GACAACCATQ GAACACCCCT CACIAOQGCA ACTTTCAGaiG AAATTAATAG 16 BO 

ACJWjCAACai: TAATGTCATC TTTGCAGTTC AACSGAAAACA ATTTCATTGG TATAAOGATC 1740

TTCTACCCCT CTTGCCAGGC ACCArTGCTG GTGRAATAGA ATCAAAf?QCT GCAAACCTCA 1800 

ATAATTTOGT AGTGGAAGCC TATCAGAAGC TCATTTCRGA AGTGAAAGTT CAGGTOGAAA 1860 

ACCAGGTACA AOGCATCIAT TTTAACATTA GOaCCATCXG TGCAGATGGG TCCA6AAAQC 1920 

a«5GCftTGC5R AHQATGCftGft AA0GTGA06A GCAATGATOA AOTTCETTTC AATGIftACRG 19B0 

TTACAATGAA AAAATGTGAT GTCRCAGOAG GAAAAARCIA TGCAATAATC AAACCTATTG 2040

GTTTTAATGA AACCGCTAAA ATTCATATAG ACAGAAAC7G CAGCTGTCAa TGTGAGGACR. 210 0 

ACAGAGGAGC TAAAGGAAAG TGTGTAGATQ AAACITI^CT AGAITCCAAG TGTTTCCAlGT 2160 

GTGATGASAA TAAAT6TCAT TTTGATOAAG ATCAGTTTTC TTCrOAiGAGT TGCAAGTCAC 2220 

ACAAGGATCIl GCCT6TTTGC AGTGGTOaAa QA0TTT6TGT TIGTGGGAAA TGTTGATGTC 2280 

ACAA&ATTAA aCTTGOAAAA GTGTATGGAA AATACTGTGA AAAGGRTGAC TTTTCTTGTC 2340

CATATCAOCA TGGAAATCTG TGTGCTGOQC ATGQAGAGTG TGAAGCAGGC A6ATG0CAAT 2400 

GCTTCAGTGS CTOGGAAGGT GATCGATCOC ASTGCCCTTC AGCAGCAGCC CAQCACTGTG 2460 

TCAATTCAftA GGGCCAACTG TGCAOTGGAA OAGGGACSTG TGTG1X3TGGA JU3GT6TSAGT 2520 

GCACCBATCXr OUSGAGOVIC GGCOSCTTCI GTGAACACTG CCXX^ACCTQT TAaTAO^GCCT 2580 

6CAAGGAAAA CIGGAATTGr ATGCAATGCC TTCACCCTCa CSUVrTTGTCT CAGGCTATAC 2640

TTGATCAGTG CAAAACCTCA TQTGCTCTCA ItSGAACAACA GCATTATGTC GAOCAAACTT 2700 

CA8AATGTTT CTCCAGCCCA AGCTACTTOA QAATATTTTT CATCATTTTC ArAGTTACAT 2760 

TCTTGATTOa QTTGCTTAAA GTCCTGATGA TTAGACAG6T GATACTAGIA TGQAATAGCA 2B20 

ATAAAATTAA GTOCTCATCA GATTACACSU3 TOTCAGCCXC AAAAAAGGA.T AaSTTGATTC 2880 

TGCAA3^TGT TTGCACAAGA GCAGTCAOCT ACOGACGTGA GAAGCCTGAA GAAAXAAAAA 2940

TGGAXAXCAO CAAATTAAAT GCTCATQAAA CTTTCBG6TG CAACITCTAA AAAAAGATTT 3000 

TTAAACACrr AA^CGGGAAAC TGGAATTGTT AATAATTGCT CCTAAAGATT ATAATTTTAA 3060 

AAGTCACAGG AOQIUSACaAA TTGCTCACGG 1CATGCCAGT TGCTGSTTOT ACACTCQAAC 3120 

GAM3ACT6AC AAGTATOCTC ATCATGATGT GACTCACaiTA GCTGClGACT TTTTCACAGA 3180

AAAATGTGTC TTACTACrGT TTGAGACTAG TGTOGTTGTA GCACTTTACT GTAATATATA 3240

ACnCATTTAG ATGAGCATAG AATOTAGATC CTCTGRAG*^ CACTGATTAC ACTTTACAGG 3300 

TACCTGITAT CCCXACSGCTT CXXIAGiUSAGA ACAATCCTGT GASAGAGTTT AOCATTOTGI 3360 

CACTACAAGS GTACAGTAAT COCTGCACTQ GAGATGrGAa GAAAAAAAlTA ATCXGGCAAG 3420 

TATATTCTAA OGTTGOCAAA C9jCTTC1\AC& GTTGGTQGTT GAATAGACAA GAACAGCTAG 3480 

AlXaAATAAAT GATTOGTGTT TCAC7CTTTC AAGAGGTGAA CAGATACAAC CTTAATCTTA 3540

AAAGATTATT GCTTTTTAAA GTGTGIIAGTT TTATGCATGT GTGTTTATGG TTTGCTTATT 3600 

^TOaWkGftr GGATACTAAT TOCAGCATTC TCTCCTCTTT GCXTCIHOGT TTTGTTTTCT 3660 

TTTTTACAOG ATAaGTTTA^T GTA^CGTCACIk GATGftCTGGA TTAATT3W0T OCTAAOTTAC 3720

TACTGOCATA AAAAaCTART AATACAATOT CACTTTATCA QAATRCTAGr XTTAAftAGCT 3780
GAATOTTAA



Seq ID NO: 480 Protein sequence
Protein Accession #: NP_002305






I 

MCG&AIAFFT 
IiGPBCSGIfCVQ 
GBVSIQLRSG 
SHDPBLGPOS 
VHRaKISGHI 

VFxnxaicaxiK 



11 21 

1 I 

AAFVCL^^R RGPASFLmA 



CVDBTFM>SK 
WGKYCBKm) 
CSSRGTCVGG 

PYItVSASXKD 



AEANFHLKVH 
YVDKTVSPYI 
DTPEGGFDAM 
HNVYVKfiTTM 
AfSLNXniVVBA 
arVTVTMKKiCD 
CFQCDEKKCT 

RCBCIDPRSZ 
D(2T&£CFS&P 
KX»ZIdQ3VCTH. 



PLKKYPVDLY 

lqaavce;shi 

BHPSliGOL&B 
YQKLISSVIV 
VTGGKKYAII 



CAGHGBCEAG 
GRPCEHCPTC 
SYZiRIFPIIF 



31 
I 

WVFSLVW3LG 
&KGGSVDSIE 
-XJiVDVSASHa 

c&dyehjdcmp 

GKRKBAKRUIt 
KLIDHNIKVI 
QfVEHClVQaiY 
XPIGFHSTAK 
CKSHKDQFVC 
ROQCFSGHHG 



51 
1 

KAASCAHGIiA 



IVTFIklGItLIC 
EIKHDIGKLET 



YPSVBVI3PT 
lilSIBKUaavG 
PHGYIHVL8I* 
IiVHIDC2T£HL 
FAVQGKQFHN 

IHIHHNCSC^ 
SCSIGVCVOGK 
PRCQCPSAAA 
MQCLHPBlilLS 
VIiXXROVIM) 



NDL6RKMAFF 
^SSTITBFEKA 
ALD6ICLA6IV 
TKDDLPliliPG 
SRKPO^EGCR 
CEDNROPKOK 
CSCHKIKU3K 
QHCVlf&KGQV 
QAILDQCBCrS 



Seq ID NO: 481 DNA sequence

Nucleic Acid Accession #: NM_003318.1

Coding sequence: 1..2574



X 
I 

ATG6AATC0G 



ATTTCXGCAS 
AAOCCAiSAGG 

QATGCTcrrr 

GATAAATATG 
GCTATTCAAG 
AAATTTGCTT 
AAAAGTAAAC 
QAAATTGCCC 
AAGAATTTAT 
CATTTACAGA 
TTATATGGAG 

caaactaaca 
agcocagatt 
acctctagat 
tocxgtgaat 
tcaoatgaaa 



11 
1 

ASGATTTAA6 
AAAATAAGTT 
ATACTACACA 
ACIGGTTGAG 
TAAATAAATT 
GCCAAAATGA 
AGCCnSATQA 
TTGTTCATAT 
AACTTCTTCA 
TGCGGAATTT 
CASCATCTAC 
ATAGGAACAA 
AGAACAT6CC 
AAACTAAACA 
GTGATGTGAA 
CAGAATGOOG 
TAAGAAATTT 
AGftSTZCTGA 



21 
t 

TGGCA6AGAA 
TAAAAAIGRA 
TAACrCGGGA 

TUTQinracTC 

GRTTGGrCXSr 
SAQTTTTGCr 
TGCAGGTGAC 
ATCTTTTGCA 
AAAAOCIGZA 
AAACCPCCAA 
GGTATTAACT 

cagttgtgat 

AOCACAAGAT 

gtcatgccca 

GACA0ATGAT 
AGATTTGQTr 
AAAGTCTGTr 
ACTTAITATT 



31 
1 

TTGACAATT9 

GAcerrACTc 

ACTGTTAACC 
AAACTAGAGA 
TACAGTCAAG 
AQAATTC3UW3 
TACTTTCAAA 
CAATTTGAAC 
GAAC6TGGAG 
AAAAAGCAGC 
GCCCAAGAAT 
TCCAGAGGAC 
GaU3AAATAS 

tttggaagag 
tcrgttgtac 

GTGCCTGQAT 
CAAAATASIC 



41 

! 

ATTCCA3AAT 
ATGAACTAAG 
AAATTATGAT 
AAAACAOrGT 
CAATTGAAOC 
TGAGATTTGC 
TGGCCAGAGC 
TGTCACAAGG 
CAGTAOCACT 
TGCITTCAGA 
CATTTTCOGG 
AGACTACXAA 
GTTACOGQAA 
TCCCAGTTAA 
CITGTTTTAT 



ATTTCAAQGA 
XAACCCTGAA 



51 
I 

CTT^ATAAA 
GATGG<3VAAC 
TOOGCTAAGT 
GCTTCCXSOCA 
TGAATTAAAA 
AKACTGCAAG 
TAATGTCAAA 
AGAAATGCTG 
GGASGAAAAG 

ttowzttqqg 

ASOCAGGTTT 
TTCATTGAGA 
CCTTCTAAAT 
GAJUUkGACAA 
TGGAAAmAT 
AOCICTGGTG 
GAATAAAAOG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 



60 

120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
OAATCAAGTC TTCTAGCTAA ATTAG2WVGAA
GAGAfiTAACC AOAAACAfiTQ GCAATCTiUUa 
OCTGCATCTT CAAATCACTG GCftSATTGOG 
AAACKXIUIXA CTTTTGAGCA ACCTCTCTTT 
ACATCTAAAT GGTTTGSICCC AAAATCTATT

GATTACATQA QCTOTTTTAG AACTCCAGTT 
TT6TCAACAC CTTATGGGCA ACCTGCCIGT 
ACTOCftCTTC AAAATTTACA aorTTTAGCh 
AAAGGAAGAA TTTATTCCAT TTTAAAGCAQ 

CRGGTGTTAA ATGAAAAGAA ACaaATATAT

GATAAOCftAA CTCTTGATAS TTACCGQftAC 
CACAGTGATA AGATCATCO? ACTTTATGAI 
OTAATCSQAiST GTGQAAATAT TGATCTTAAr 
OCATGGGAAC GCAAGRGTTA CTG(3AAAAAT 

C!ATt3GCATTS TTCACAiSlGA TCTTAAACCA

AAGCTAATTG ATTTTGQOAT TGCAAACXSA 
GATTCrCASG TTGGCAQ^T TAATTATATG 
TCXaSAGAGA AIIGGGftAATC TAAGTCRAAG 
GOATOTATTT TGTACTATAT GACTTAOGGG 

ATTTCTAAAT TACATGCCAT AATTGATCCT
GAGAAAGATC TTCAAOATGT GTTAAAGTGT 
TCCATTCCTG AGCTOCTGGC TCRTCCCTAT 
ATOQCCAJUSS 6AAOCACTGA AGAAATGftAA 
TCTOCTAACT CCATTITSAA AOCTGCTAAA 

AGTCATAATT CTTCRTCCTC CAAGACTTTT



ACTAAAGAGT ATCAAOAAOC AGAGGTTCCA 1140 

AGAAAGTC^AG AGTGTATTAA CCAGAATCCrT 1200

GAGTTAGCCC GAAAAGXrAA TACAGT^GCAG 1260 

•KMTTraiA AACAGTCACC ACCAATATCA 1320

TGTAAGACAC CAAGCAOCAA TACCTIGQAT 1380 

GTAAAGAATG ACTTTCCACC TGCTTGTCAG 1440 

irrCCAGCAGC AACAGCATCA AATACTTGCC 1500 

TCTTCTTChG CAMlTaAAIG CftmOGGTT 1560 

ATAGGAACXTQ QAOGTTCAAG CAA0GTATTT 1620 

GCXATAAAAT ATGTGAACTT AGAAGAAGCA 1680 

GAAATAGCTT ATTXGAATA?^ ACTACAACAA 1740 

TATGAAAO'CA OGGACCAGTA CATCIACATG 1800

AGTTQ G CTTA AAAAQAftAAA ATCCATTGAT 1860

ATGITAGAGG C2VGTTCACAC AATCCATCAA 1920 

GCTAACTTTC TQATAGTIGA TGGAATGCTA 1980

ATGCftACCAG ATACAACAAG TOTTGTTAAA 2040

CCACCAQAAB CAATQUUV&A TATGTCTTCC 2100

ATAAGOCCCA AAAGTGKTQT TTGGTCCTTA 2160 

AAAACACCftX TTCAGO^GAT AATTAATCAG 2220

AATCfiTGAAA TTGAATTTCC CGATATTCCA 2280 

TGTTTAAAAA GGGACCCAAA ACAGAGGATA 2340 

GTTCAAATTC AAACTCATCC AGTTAACCRA 2400

TATQTTCTGG QGCAACTTer ^TGGTCTGAAT 2460 

ACTTTATATG AACACTATAIQ TOSTBOTOAA 2520 
GAAAAAAAJhA GGGGAAAA2UV. ATGA 



Seq ID NO: 482 Protein sequence
Protein Accession #: NP_003909.1

1 11 21 31 41 51

| | | | | |

HBSEDLSGHB I/TIDSIMUKV BDIKNKPKNE DLTDBLSIMK igADTTENSQ TVNQlMMMflN 60 

HPaSDVOiSLLL KLBiQISVPI>S DftCUnOaEOR YSQAIBALPP DKYGQHBSEA RlQVUFASbK 120 

AXQEPDDAKD TEQKUWNCK KFAFVHISFA QFEIiSQQSVK KSKQiLI^QKAV CRGKVPLEHL 180 

filAUZHIiNLQ KKQLLSEEEK KHIiSASTVIiT AQB8P&GSLG KLQNKC3NSCD SI^SOTTKASP 240

IiYOENMPPQD ABXGySKGLR QTNKTKQSCP FGRVPVOTaLK 9PI3CDVKTDD SWPCFMKRQ 300 

TSRSECRDLV VFGSKPSOND SCBIaRNXiKSV QHSHFKEPLV &DEKS8ELXX TDaXTItKNXT 360 

SS9£U«LEB TKEITQSPBVP ESNQRQIfQSK KKSECZNQNP AAfiSBOaHQIP BLARKVHTEQ 420 

KHTTFBQFVP SVSRQSPFIS TSKMTOFKSI CfCTPeGNTLD inrM8CF8TF7 VKNDFPRADQ 480 

LSTPYGQPAC FQQQQpaQILA TPLQNLQVLA SSSKSIBCI3V KGRIYfiIl>KQ IGSGGSSKVF 540

QVLKEBOCQIY AIKYVSriiEEA DNQTIDSYRN EIAYIAIKLQQ HSDKIIRliTO VSITDQYIYM 600

VMBCGNlDUr SHliKKKKSID PHERKSYWKN HIjEAVHTIHQ HGIVHSDliKP ANFLIVDGMb 660 

ICLIDFGIMaQ NOnXrrSWK SDSOVOTVimS PSSMKDUSS SaSNCKSKSK ISFKSDfVilSL 720 

GCIXtYXHrVB RTPFQQIIHQ ISKimilDP »HBXE?PPIP EKDDQIIVIjKC CLXRDPKQRI 780 

SIPBUU^HPY VQIQTHFVNQ KARGTTEEMK YVUaQLVGLH fiPHSIIaKAAR TXrYBinrSOGS 840
SHBIS39SKTF EEOOIGKK 



Seq ID NO: 483 DNA sequence
Nucleic Acid Accession #: NM_002667.1
Coding sequence: 1..2651

1 11 21 31 41 51 

| | | | | |

A'DOGACACCr CCXX3GCICGG TGTGCTOCTQ TCCTTQCCXG TGCTGCIGGA GCTGGCGACC 60 

GGGGGCA0CT CTCCCAGGTC irGGTGTGTTG GTGAGGGQCT GGCCCACAC^ CKSTCAXTOC 120

GAGCCOGAOG GCAGGATGTT GCTCA0GGT13 GACTGCICCG AOCTGGGGCT CTOGGAGCTO 180 

OCTTOCAACC TC3USOSXCTT CAGCTCCX2WC CTAGAOCTCA GTATGAACAA CKTCMnCRS 240 

CISCIOCCGA ATCGCCIGGC CM3TCTC0aC TTCCTGanaG ASTCAGGTCT TGGGGGAAAC 300 

QCTCTQACAT ACSOTTCOCAA GGGAGCATTC ACTGGGCTTT AC3GTCTTAA AGTTCITATG 360 

CTGCAG7ATA ATCftGCTAAG ACAGSTACOC ACASAASCCC TGGA6AATTT GCGAAGCCTT 420

GUUirCCClGC GICTOGATGC TAACCACATC AQCTATQTOC CCCCSlAQCTG TTTCAGTGGC 480 

CIGCATT0C3C TQASGCAiCCr GTGGCTGGftX GACSUVTGCXST TAACAGAAAT CCCOOICCAG 540 

GGTTrO&GAA GTXTfOOGGC ATTGCAAQCC ATGACCTTGS CCCTGAACAA AATACACCAC 600 

ATACCAGACT ATGCCITTGG ft&AOCTCTCC AGCTTQGTAG TTCTTACATCT CCATAACSUkT 660 

AGAATGCACT CGCTGGGAAA QAAASIGCTTT GASX^GQCTOC ACAGGCXAGA GAC7TTAGAT 720

TTAAATTACA ATAACC7TTGA TGAATTCCOC ACTGCAATTA GGACACICTC CAACCTTAAA 780 

GAACTACATT TCTATGACAA TCOCAICCAA TTTBTTQGGA GATGTGCTTT TCAAGATTTA 840

C3CIGAACTAA QTUWCACIGAG TCIGAATGGT GGCTC3KVAA TAACVGAATT ZCCTGATTTA 900 

ACTOQAACXe CAAACCTGGA GAGTCTGACr TTAACXGOAtS CAit2A6ATCTC ATCTCTTCCT 960

CRAACCGTCT QCAATCACTT ACCTAATCKI CAAGTGCTAG ATCTCTCTTA CAACCTATTA 1020

GRAGATITAC CCAGrTTTTX^ AGTCTGCCAA AAGCTTCftGA AAATTGACCT AAGACATAAT 1080

GAAATCIACG AAATTAAAGT TGACACTTTC CAGCASTTGC TTATOCTCOG ATOGCTOAAT 1140 

riUUCll GGA ACftAAATTGC TATTAITCAC OCCAATGCAT ZTTCCACTIT GCCATOCCTA 1200 

ATAAAGCTGB ACCXATC3SIC CMCCTGCT6 TOGTCTTTTC CTATAACTGG GTTACATGQT 1260 

TTAACTCACT TAAAATTAAC AQGAAATCAT GDCTTACAGA GCSTSGFaMCC ATCTGAAAftC 1320

TTTOCAGAAC TCAAQGTTAT AQAAATGCCT TATGCTTACC AGTGCTGTGC ATTTOanGTG 1380 

TOTGIbGAATG CCIATAAGAT TICTAATCAA TGGAATAAAG GOmCAACAG CftGTATGGAC 1440

GAiCCITCATA AGAAAOATGC TGGAATGTTT CAGGCTCAAS ATGAAOGTGA CCTTGAAGAT 1500

TTOCTQCITG ACTTTGAGGA AGikOCTGAAA GCCCTTCATT CAGTGCAGTG TTCACCTTCX: 1560 

CCAGGCCXX:! TCAAAflCCTG TOAAEADCTG CTTGATGGCT QGCTOATCAG AATTOeAGTG 1620

TGGACCATAG CAOTTCTGOC ACTTACTTGT AATGCTTTGG TGACITCAAC AGTTTTCAGA 1680 

TOCCCTCTQT ACATTTCCCC CATTAAACTG TTAATTGOGG TCATCGCAGC flOTQAACATG 1740 

CZGACGGGAG TCXCSCAGTaC OGTOCTGGGT GGTGTGGATG OGTTCRCXXX TGGCAGCmT 1800

OSAGAATGGG GiroGrTTGCC AT61CATTGO TZTTTIGTCC 1860



ATTmeGCTT 
TCraHGAAAT 
ATTTTGCTCT 
AAOTATQQCXi 
TACftTGGTCG 
ACCAAGCTCT 

TTGTCCTTCT 
CTTCTGGTGa 
CCTCftCITTA 
AAACJUX2CAA. 
AClCAftGOCT 

CCftTGTCTTA 



CftGAATCATC 
ATTCTGCAAA 

GTGOCCTGCT 

ccrrccccTCT 

ACTGCAATTT 
TTGCCCTSTT 
CCTCTTTAAT 
TAGTCCCACT 
AGGAGGATC7 
GCTTGATGTC 
TGGTAACCTT 
CAGCTTATCC 
A 



TGTTTTCCTG 
ATTTGAAAOG 
GOCCTTGACC 
CTGCCTQCCr 
GCTCAATTOC 
0(3ACiUbGf36A 
GCTCTTCT^C 
AAACCTTACA 
TGCTQCArrGT 
G6TGAGCCTG 
AATTAZbCrCr 

ABTBACTGIUS 



CTTACTCTGG 
AAAGCTCCAT 
ATGGCCGCA6 
TTGCCTTTTG 
CTTTGCTTCC 
GACCTOOAfiA 
AACIGCATOC 
TTTATCAGTC 
CTCAATCCCC 
AOAAAGCAi^ 
GATGATGTCQ 
AGCAICACTT 
AGCTGCCATC 



CAGCOCTOGIA 
TTTCTAGCCT 
TTCCCCTGCT 
GGGA<SCCCAG 
TCAT6ATGAC 
ATATTTGGGA 
TAAACTGOCC 
CTGAAGTAAT 
rrOTCTACAT 
CCTAOQTCTO 
AAAAACM3TC 
ATGACCTGCC 
TTTCCTCTGT 



GCGTGGGTXC 
GAAAOTAATC 
GGOTGGCAGC 
CACCATGGGC 
CATTfiCCTAC 
CTGCTCTATG 
TGTGGCTTTC 
TAAfiTTTATC 
CTT6TTCAAT 
GACAAGATCA 
CTGTGACTCA 
TCCCAGTICC 
GGChTTTGTC 



1920 
1980 
2040 
2100
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 



Seq ID NO: 484 Protein sequence
Protein Accession #: NP_003658.1



1 
I 

Mi>T6RL6VI>L 
PSKIiSVFTSY 

AFRBZ.6ALQA 



TGXANIiBSLT 
EIYBIKVDTF 
LTHLKLTGNH 
SLHKKDAGMF 
VrriAVLAZiTC 



lUCALIAItT 
TKLYCHIiDECG 

TQAIiVTErrSS 



11 
1 

SliPVLajQZiAT 
LDLGKNNieQ 
TEALQSNIiRSIi 
MTIALNKIBH 
TA2RTIiSNIjK 
LrGAQISBIiP 
QQLLSL&GLH 
ALQStilSSEN 
QAQDERDI.BD 
UAIiVTSrVFR 
VSCH7IGFLS 
KAAVFLLGGS 
DLEEIIHDCSK 
I.NPUiYI.£iFH 
SITYDLPPSS 



21 
I 



31 
I 

LROCPTBCHC 



51 
1 



QSlSRlSDMmJ 
IPDYAPGNIiS 
ELHFYEttlPIQ 
QTVCEBQLENZi 
LAHMKIAIZH 
FPEIiKVlErtP 
FIjIiDFSEDIiK 
6PI>YI8:PIKI» 
IFASESSVFI. 
KVQASPLCI.P 
VKHIAI*riI*PT 
PHFKBDLV6I, 
VPSPAYPVTB 



SYVPFSCFSQ 

FVGaSAFQSIIi 
QVUniSYNLL 
PNAFSTUPSIi 
YAYQCCAPOV 
AIMSVQCSPS 
LIGVIAAVUM 
Z.TIAAX<ERGF 
I.PFGEPBTMG 
NCIUffCPVAF 
ItKQTYVWTRS 
SCHIiS5VAFV 



ALTYIPXCAF 
li&SIiSUIMLD 
SIK3LGKKCF 
PEIjRTXjTLNG 
EDLPSFSVOQ 
IKUTLSSULIi 
CEXZAYKISHQ 
PGPFKPCBEIf 

SVlCYS'AKPBr 
YMVaiilZiLNS 

lsfssliult 



TGIiYSUCVZM 
DNALTEIPVQ 
DQLUSLBTIjD 
ASQITEFPPIi 
KZiQKlSUUIH 
SSFPnOUBS 
WNKSDNSSMD 
IiDGHIiZRlGV 
GV12AFTF68F 
KAPFSSL1CVI 
IiCFLHMIIAY 
FISPBVXKPI 
DDVSRQ8CDS 



PCL 



Seq ID NO: 485 DNA sequence

Nucleic Acid Accession #: NM_005756.1

Coding sequence: 73..3117



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840



1 11 21 31 41 51 

| | | | | |

AGCCAGOCOG AC3GA0GCGAG OOGCRGGTGT GCACW3AGGT TCrTCCACTTT GTTTTCTGAA 
CTCGOGGTCA GGATGOTTTTr CTCTOTCRGG CRGTGTGGCC ATGTTGGCAQ AACIGAAGAA 
GTTTTACTGMi CBTTdUUSAT ATXCCTTGTC ATCATTTGTC TTCAItSTCGT TCT06TAACA 
TCGCXGGAAG AAGATACTGA KAATTCCAOT TrGTCAOChC CAGCIGCTAA ATTATCTGarT 
QTCAOrrrns CCCXXTCCTC CAKIGnGGTT GAAAO^ACAA GCCTCAATGA TGTTACTTTA 
AGCTTACTCC CTTCAAACJOA AACAOAAAAR ACTAAAATCA CXATAGTAAA AAdCTTCAAT 
QCITCAGGCG TCS^AACCCCA GAGAAATATC TGCAATTTGT CKlCVMrXTG CAATGACTCA 
GCATTTTTTA GAGGTGASAT CATGTTTCAA TAXGATAAAG AAAGCACTOT TCCSCCAGAAT 
OUUCnXATAA GGAAV3GCAC CTTAACTGGA GTCCIGTCTC WU9CGAAIT AAAAOGCTOL 
GAGCTCAAO^ AAAOCCTBCA AACCCCAA6T GAGACTTACT TTATAATOTO TGCTACAGCA 
GAGGCCCAAA GCACATTAAA TTGTACATFC ACAATAAAAC !EGAATAATAC TMTGAATGCA 
TGTOCTGCAA TAQCOQCTTT GGAAAGIU3TA AAGATTOGAC CAATGQAACA CTGCTGGTGT 
TCTGICAGGA TAOOCIGOGC TTCCTCOCCA QAAGAGTIGG GAAAGCTTCA G^CQTGACCTG 
OU9GATCCCA TTGTCZGTCI TGCIGAGCAT GCACGTGGOC CAOPOT^TC TTCX3U3C9CAA 
TCCIATCCCAG TGGSTGCCTOG OGCC3UCTOTG ClTTQOCAfiS VOOCCAAAGC TACCTCTTTT 
GCTGMiCCTC CMSKSTKTTC ACCXCtTGACC CZVCAATGTTC CCTCTCCAAT AGGGGAGATT 
CAAOOCCTTT CAOCOCAGOC TTCAQCTCCC! ATAGCTTCCA GCCCTGCCAT TGRCATGCCC 
CCACAOTCDG AAAGGATCTC TTCCOCTATG GOOCAAACCC ATGTCXCOGG CAOCCCACCT 
CCT8TSAAA9 CCTCATTTTC CTCTCXXIACK! GTGTCTGCCC GTGCGAATQT OMCACXAOC 
AGOGCS^CIC CTGTOCAGAC AGACATOOTC AACAOCAGCA GTATTTCTGA TCTTGAGAAC 
CAAGTGTT(3C AQATGOAOAA GGCTCIGTCC TTGGGCAGCC TGGAGCCTAA CCTCGCAGGA 
GAAATGATCA ACCAAGTCAG CA6ACTCCIT CATTCCCCGC CTGACATGCT GGCCCCTCTG 
GCTCAAAQAT TEQCTGAAAGT AGTGGATGAC ATTGGCCTAC AGCTGAACTT TTGAAACAOG 
ACTAlTAAGTC TAAOCTCCCC TTCTTTGQCT CTQGCSCTGA TCAGftSTGAA TGCCASTAGT 
TTCAAObCAA ClACCTTTfiT GGCCCMtSMl GCTSCAAATC TTCAGGTTTC TCTGGftAACC 
CAAGCTCCT6 AGAACAGTAT TOGCACAATT ACTCTl'OCTT CATOGCIGAT GAATAATTTA 
CCAGCTCATG ACATGGAGCT ASCTTCCAGG GTTCAGTTCA ATTTTITTGA AACACCIGCT 
TTGTTTCAG6 ATCCITCCCT GGAGAAGCTC TCTCTOATCA GCTAOGTCAT ATCATOOAGT 
GTTGCAAAOC TQACOSTCAG GAACTTO&CA. AGAAA06XGA CAGTCACATX AAAGCACATC 
AACGGBAGCC AGGAXGACn* AACA6TSAGA WWJTATTTT GGGACTTOGG CAGAAATGGT 
GGCAGAGGAG GCTGGTCAGA CAATGGCTGC TCTGTCAAAG ACAGGAGATT QRATGaAACC 
ATCTtSTAOCT GTftGCCMCCC AACAAGCTTC GGOQTTCTGC TGGACCTATC TAGGACATCT 
G'rGCI6CX:TG CTCAAATGAT GQCTCTOACG TTCAXTACAT ATATTGOTTQ TGQGCTTTCA 
TCAATTTTTC TCtXCAOTGAC XCTTGTAIUK] TMX^GCTT TIGhAlWSAT COGGAlGGGAT 
TAOCXriTCCA AAATOCTCAT CCAGCIGTGI GCTGCTCTGC TTCTGCTGAA CCTOGTCrrC 
CTCCXSQRCT CGTGGATTGC TCTGTATAAG ATGCAAQGOC TCTGCftTCTC AGTGGCTGTA 
TTTCTTCATT ATTTTCTCTT OGTCTCATTC AiCSVTGGATGG GGCTAGAAGC ATTGCATATG 
TACCTO60C3C TTGTCAAAGT ATTTAATACT TACATOOGAA AATACATCCT TAAATTCTGC 
ATTGraaBTT GGGOGGTACC AGCTGTGGTT OIGAOCATCA TOCTGACTAT KSCCCCM3KS 
AACTATGGSC TTGQATCX^rA TGS6AAATTC O0CAAT6GTT CACGGGKTGA CTrCXGCTGa 
ATCAACAACA Al!GCAGTKrr CTACATZACSG GTGGIGGGAT ATTTCTCaTOT 6ATATTTTTG 
CIGAACXSTCA QCtaGSTQa TGTOCTCCTG GTTCAOCTCT GTCGAATTAA AAAGAAGAAG 




60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
C?iACK5GGnjG CCCRGCGAAA iUVCCAGTATT CAAGRCCTCA GOAGTATOGC TOGCXTTTACA 2580

TTTTTACTGO GARTAACTTO <3GGGTTTGCX: TTCTTTQCCT OGOGACCAQT TAAOQTOACC 2640 

TTCA1GTATC TGTTTSCCAT CTTTAATAGC TTACAAGGAT TTTTCATATT CATCTTTTAC 2700 

TQTGT{3QCCA. AAGAAAATGT CAGGAASCftA TGGAJ3G0SGT ATCTTTGTTQ TBQAAJU5TTA 2760

CGGCTGGCTG AAAATTCTGA CTOOAGTAAA ACTGCTACTA ATGGTTT3iAA GAAGCAGACT 2820

GXAAAOCAAG OAGTGTCCAG CTCTTCAAAT TCCTTACsyST CAAGCAGTRA CTCCACTAAC 2880 

TCXMCACAC TGCTAGTQAA TAATGATTGC TCAGTACACG CAAQOGOQAA TGGAAATGCT 2940 

TCTACAGAGA GGAATGQGBT CTCnrTTTAGT GT^TCftSAATG OAGAXOICTG CCTTTCACXSAT 3000

TTCRCTOSAA AACflSCACAT GTTTAAOSftS AAOOAAGATT OCTGCRATGG GAAAGGCXET 3060

ATGGCTCTCA GAACGACTTC AAAOCGtaGOA AGCTTACACT TTATTOAaCA AATGTGATTC 3120

CTTTCTTCTA AAATCRAAGC ATGATGCTTG ACA3TOTGAR A'TGTCCAATT TTACCTTTTA 3180

CACAATGTGA GAT6TATGAA AATCAACICA TTITATTCTC GGCAACATCT GGAGAAGCAT 3240 

AAiGCTAATTA ABGGCGATGA TTATTATTAC AAGAAGAAAC CARjSACATTA CAOCATGGTT 3300 

TTTAG3U%TT TCXGATTTGG TTTCTTATCT TICATXTPTAT AAGAAGQTTB GTTTTAAACA 3360 

ATACAiOTAAQ AATGACTCCT ATAAAGAAAA CftAAAAAAGG TAGTGAACTT TCAGCTAOCT 3420

TTTAAAGAGG CTAAjaTTATC TTTBATAACA "TCATATAAAG CAACTGTTGA CTXCAfiCCTG 3480

TTGQTOAQTT TAGTTGTGCA TGOCTTT&TT GTATATAAi3C TAAATTCIAG TGACXX:ATGT 3540 

GTCAAAAATC TTACTTCTAC ATTrtTTIGT AXTTKTTITC TACTGTQTAA ATGTATTCCT 3600

TTSTAdGtAATC ATGGTIGTTT TGTCTCAiCGT QATAATTCAS AAAATGCTTG CrOGTTOCGC 3660 

AAATCCTA7A GCTCCTTTTG GAGATQATAT AQGATGTGAA ATACAGAAAC CTCaGTOAAA 3720

TCAAGAAATA ATQATCOCM CCAGACTGAG AAAATGTAAG CMGACAGTGC CACAGTTAGC 3780

rrCATACAGTG CCTTTGAGCA AGTTAGOAAA AGATGCCCCC ACTGGGCA8A CACAGCCCTA 3840 

IXSGSTCATGe XTTGnCAAAC AGAG1GAC3AG ACCATATTTT AGCCCCACIC ACOCICITCG 3900 

GTGC2UX3ACC TGTACAGCCA AACACAGCAT CXSUVTATGAA TAOCCATOCC CTGAC06CAT 3960 

CCCCAOTAGX CAGATTATA6 AATCTGCACC AAGATGTTTA GCTTTATAGC TTGGCCACaa 4020

AGAGGGATGA ACIBTCATCC AGAiCCASXSlG TCAG6AAAAT TGTGAACOTA GATGAOGTAC 4080

ATACACTGCC GCITCTCAAA TCCXX3U3AQC CTTTAGGAAC AGGABAGTAG ACTAGOATTC 4140 

C7TCTCTTAA AZUUBSTACOr ATATAT66AA 2UUWVTCATA TTGCCGTTCir TTAAAAGGCA 4200

ACTGCATGGT ACATTOTTOA TTOTTATBAC Tl3GI!ACACrC TG6CCCAGCC AGAGCTATAA 4260 

TTOTTTTrEA AA1GTGTCTT GAAGAATGCA CAGTGACAAS GGGAGTAGCT ATTGQGAACA 4320

GGGAACTGTC CTACACTQCT ATTGTTOCTA CATGllATCGA GCCTTQATTQ CTCCTAGTTA 4380 

TATACZUSGGT CTATCTTGCT TOCTAOCTAC ATCTGCTTOA GCAGTGOCIC AAGTACATCX: 4440 

TTAVTAG6AA CATTTCAAAC CCCTTTTAGT TAAGTCmC ACTAAGOTTC TCTTGCATAT 4500 

ATTECAACTG AATGTTGGAT CTC3K3ACIAA OCATAGTAAT AATACACAIT TCTG TGAGTG 4560 

CXGACITSTC TTTGCAATAT TTCTTTTCTG ATTTATTTAA TTTTCTTQTA TTTATATGTT 4620
AAAA3CKAAA ATGTXAAAA'T CAAIGAAATA AATTTQCAQT TAAGA 



Seq ID NO: 486 Protein sequence
Protein Accession #: NP_005747.1

1 11 21 31 41 51 

| | | | | |

HVFSVRQOGH VGRTEEVIiI^ FKXFLVIJCb HyvLVTfiliBS DTDNSSLSPP PAKLSWSEA 60 

P3SHBVETTS USOVTLSLttP SHEIGVXPQR XIICNL8STOI DSAFFSGSIM PQVDKSSTVP 120

OKOHXT&IGTL TGVX»SI«SELK RSBLHKTI^ XiSETYFIHGSl TAEAQSTLirC TFTIKIiNJITH 180 

NACAVIAALB HVKIRPKSHC CC5VRtPCP& SPEBLEKLQC PLQDPXVCLA DHEHGPPFSS 240 

SQBIPWPRA TVLSQVPKAT SFAEPFOTSP VTOSSnSEZG SIQPLBPQPS APXASSPAID 300 

NPPQ9ETZS9 PKPOIHVSGT PPPVKASFSS PTVBAPmrVN TXSAPPVQTD IVXIT8SISDL 360 

ESlQVIiQHEKA LSLGSLSEm AGEKIiaQVSR IiZaSPPDHLA PIiAQSLLKW ISDIGLQUSnPS 420 

NTTISLfTSPS LAIiAVrRVMA SSEMTTTFVA QDPAE5K3VSI. ETQAPBEJSIG TITLPSSLMH 480 

HLPAHDMELA SRVOFNFFET PAZiFQDPSLE SIiStilfiYVIS SSVAHUTVKH XiTBHVTVniK 540 

HIEIPSQDBLT V]U:VFHDLGB NGGStGGHSDH GCSVKDRRUI BIlCTC&HLT SFGVZiLDLSR 600 

T8VU?AQ(HMA IiTFITYIGCG LSSIFIiSVZI. VTVXASBSCZR StDVPSKISiIQ LCAAIiLLI^Hl. 660 

VFLUDSHIAL ITKMQGLCIGV AVFLEYFULV SFTHHGLEAP BKYLAUVKVF KTYIEUCniiK 720 

FCIVGHGVPA VWrrrLTZS PQEIYGliOSXG KPENGSPDDF CHINNHAVPY ITWGitPCVl 780 

FIiLHVSMFIV VLVQIfGRIKK KKQLGAQRKT SIOEUj^XAG LTFLIiGITKG FAFFAHGFVXT 840

VTFHXLFAIF SITLQGFFXFX FZCVAXENVR K(2KRSYLOOG KLREiASHSDH SKTATI3GLKK 900 

QmfrQSV&SS SNSLQS&SKrS INBTTEtLinSK DCSVBAGGHG HABTEEkH678 FaVQBI^lVCEi 960 
KDFTGKOHMF NBKEID5CNOK GRMAUARXSK RGSiaFlBQH 

Seq ID NO: 487 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..2904

1 11 21 31 41 51

| | | | | |

A1G6TTITCT CI6IC3U5GGA GTGTOGCCAT GTTOaCAGAA CTGAAGAA6T TTTACTGAl^S? 60 

TTCAAQATKT TOCTTGTCAT CAITTGICTT CATGTCGTTC TGOTAACATC CCTGGAAGAA 120 

GATACTGAXA AnMCAGTTT GTCACCACCA CCIGATGTTA drTTTAAGCm ACTCCCTTCA 180 

AAOGAAACA6 AAAAAACTAA AATCACXATA OTAAAAACCT TCAATGCTTC AGGCSTTCAAA 240

CCCXaGflGAA ATATCTGCAA TTTGTCATCT ATTIGCAATO ACTCAGCATT TTTTAGZW3GT 300 

GAGATCATGT TTCAATATQA TAAAGAAAGC ACTQTTOCCC AGAATCAACA TATAACGAAT 360 

GGC3VCCTTAA CTGQAGTCCT CTCTCTAAGT GAAITAAACA CATTAAATTG TACATTCACA 420 

ATAAAAJCIGA ATAATACAAT GAATGCA1X3T GCIGTAAl^AG CTGCTTTGGA AAGAGTAAAG 480 

ATTCGACCAA TQGAACACT6 CTGCT6TTCT GTCRGQATAC dTTGCOCXTC CTOCCCAGAA 540

GAGnrTGOAAA AGCTTCAQIXS TGAOCIGCAG OATCOCATTG TCTGTCTTGC XGAOCATIZCA 600 

CSTGGOCCAC CATlTTCrrC CAGCCAATCC ATCCCAGTGQ 7GCCTCGQGC CACTGTGCTT 660 

TCCCAtSOTCC CCAAAGCZTAC CTCTTTTGCT S^GCCTCCAG ATTATTCACC T^TGACCCAC 720 

AATGTTCKCr CTCCAATAGG GOAGATTCSU^ CCCCTTTCAC CCCAGCCTTC AGCTCCCATA 780 

GCTTCCAQCC CTGCCMTGA CATGCCCOCA CRGTCTGAAA OGATCTCTTC CC3CTAaXSOCC 840

CAAAOCCATG TCTCCBOCAC CCCACCICCT OlGAAAGOCT CATir£TCGTC TGCCACC3GTB 900 

TCXGCCCCTG CGAATGICAA CACTACCAGC GCACCTCCTG TCCAGAOU3A CATCBICAAC 960 

AOCAGCAGTA TTTCTGATCT TQAGAAQCAA CTGTTGCAGA TGGAGAASGC TCTGIVXTTG 1020

GGCAGCCXGG AGCCTAACCT GGCS^GGAGAA ATOATCSU^ AAOTCAGGAa ACTCCITCAT 1080 




PCT/US02/36810






TOCCGGCcre 

OCTGT6ATCA 
GCAAATCTTC 

cTTccrrcAT 

CAGTTCAATT 
CTGRTCAGCT 

6TATTTT<3GG 
GTGAAAGACft 
GTTCIGCTGG 
ATTACATATA 

AXAGcnma 



CAAGGOCTCT 
TQQATGGGCC 
A7GOSAAAAT 
AOCATCATCC 
AAT6GTTCAC 
GTGGGATATT 
CASCTCTOXC 
GACCTCAGGft 

OUGQATTTT 
AGGGG6TATC 
aCTACTAATG 
TTACIVGTCAA 
OXACACeCAA 
CAGRATGC3AG 
GAZUBATTCCT 
^%CACTTTA 



ACATGCTGQC 
TGAACTTlTC 
GAGT<3RATt3C 
AGGrTTTCTCT 
CGCTGATGAA 
TTTTTQAAAC 
ACGTCATATC 
TChCATTAAA 

GGAGATTGAA 
A^TATCTAG 
TTOGTT6TGG 
AAAAOATCOe 
TGCTGAACCr 
GCATCrCAGT 
TAGAAGCATT 
ACATCCTTAA 
TGACTATATC 
GGGMCGACrr 
TCTGT6TGAT 
OAATTAAAAA 
GTATOSCTGG 
GACChieZTAA 
TCKTATTC&T 
TTTGTTGTGG 
OTTTAAfiJaAA 
GCAfirAACTC 
GCXSSQAATGQ 
ATGTGTGOCT 
GCaATGOC3AA 
TTGAGCAAAT 



COCTCTGGCT 
AAACACGACT 
CAGTAGTTTC 
GGAAAGOCAA 
TAATTTACCR 
ACXn:GCTTTG 
ATCGAGTGTT 
GCACATCAAC 
AAATOSTOGC 
TGAAACCATC 
GACATCTGTG 
GCTTTCATCA. 
GAGGGATTAC 

GGCTGTATTT 
OC3^TATGTAC 
ATTCTGCMT 
00CM3IWTAAC 
CFQCTGGATC 
ATTTTTGCTG 
CSRAORAQCAA 
OCTTACATTT 
CGTGACCTTC 
CTTTTACTGr 
A7VAGTTAGGG 
QCAGACTGTA 
CACTAACTOC 
AAATGCaTCT 
TCACGA7TTC 
AfiSOCSTAStv 
GTGA 



CAAAGATTQC 
ATA2kGTCTAA 
AACACAACTA 
GCTOCTGAGA 
GCTCATQACA 
ITTCAGGATC 
6CAAACCTC3A 
CCXaAGCCAjQG 
AGAGGftGGCT 
TGTACCTGTA 
CTGCCTGCTC 

CCTTCCAAAA 
CTQQACTDGT 
CTTCATTATT 

GTOGGTTGGG 
TATQQGCrTB 
AACAACAATG 
AAOGTCAGCA 
CTGOGAGCCC 
TTACTQGOAA 
AXGTATCTGT 



CTGQCTGAAA 
AACCAAGOAG 
ACCACACTGC 

ACIGGAAAAC 
GCTCICAGAA 



TGAAAOTAGT 

ccTCxxxrrxc 

CCTTTGTGGC 
ACAGTATTGG 
TG6AGCTAGC 
CTTCXZCTQQR 
CCGTCAGGAA 
ATGAGTTAAC 
WTCAGACAA 
GOCATCTAAC 
AAATGATGGC 
CAX^TGACXCr 
TCCTCATCCA 
GGATTGCTCT 



TCAAAiSTATT 
GGGTACCftOC 
GASCCXAIXSG 
C3V5TATTCTA 
TGTTGATTGT 
AGCGAAAAAC 
TAACTTGGGG 
TTGCCATCTT 
AAAATGTCAG 
ATTCTGACTC? 
TGTGCAGCTG 
TAGTGAATAA 
ATGOGGTCTC 

GGACTTCftAA 



GGATGACATT 
TXTGGCTCTG 
CX»AGACCCT 
CRCAATTACT 
TTCCAGGGTT 
GAACCTCrrCT 
CTTGACAAGA 
AGTOAGATGT 
TGGCTGCTCT 
AAGCXTCGGC 
TCTGAt^GTTC 
TGTAACCTAC 

GTATAA6AT6 
CTCATTCACA 
TAATACITAC 
TGTGQTTGTG 
GAAATTOOCC 
CKXTAiCQGTG 
GGXCCTGGTT 
CAGTATTCAA 
CTTTGOCITC 
TAATACCITA 
G2UU3CAATGG 
aaOTAAAACT 
TTCAAATTC3C 
TGATTGCTCA 

:rAA€GAGAAG 
GGOaOGAAaC 



Seq ID NO: 488 Protein sequence
Protein Accession #: Eos sequence



1 
I 

HVF3VBQ0Sa 
HBTBKTKITI 
GTLTGVIiSLS 
ELEKLQCDLQ 

aAPKHVMTTS 



ANLQIVSItEXQ 

iiigYvisaav 

VKDRRLHBTI 



nMGLEAFHHY 

PLRSZAGIjTF 

VEASGNGH 
Z^XBQH 



11 
1 

VfiRTBBVIiItT 
VKTFNftSGVK 
HijHTIiH'CTFT 
DPIVCLADHP 
PLSFQPSAFl 
APFVQTDIVff 

APEKSIGTIT 
ANIjTVRNIiTR 
CTCSHUrSFG 

IiALVXVFNTY 
MNHAVFyiTV 
liLGITtfGFAP 
lASaSSXMSKI 



21 
1 

FKIFI.VIICL 
F0RSIICIILS8 
IKU9SITMEIAC 
RGPPF8SSQS 
ASSPAHIHPP 
TSSISDXJBNQ 



IiPfifiLMNEIItP 
NVTVTIjKHIW 
VLLDLSRTSV 
AI>£JiLNZ*VFL 
ZRKYUiKPCI 
Vl?irFCVIPI.Ii 
FftHGPVNVTF 
ATHOliKKQTY 



31 
I 

mrVLVTSIiEB 
XOmSAFFRG 
AVIAAI£RVK 
IPWPHATVIi 
QSETIS8PMP 
VJjQHBKALSJj 
XfiLCfiPSIiAL 
AHDMELASRV 
PSQDELTVRC 
LPAQMMALTP 
ZJ39NIALYKK 
VGHGVPAWV 
NVfiMFlWLV 
MYLFAIFSnTi 



41 
I 

DISHSSLSPP 



IRPMESOCCS 
SQVPKArfiFA 
QXI1VSGT7PP 



AVnOWASSF 
QFHPFETPAli 
VFHDL6E2H06 
ITYIOOGItSS 
QGLCXSVAVF 
TXXttTXSEIXr 
QLCKIKKKKQ 
QCSFFIFIFYC 
LQSSSHSTN? 



Seq ID NO: 489 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2811



1 
I 

ATGGTFTTCT 
iTCAAdATAT 
GATACTGA1A 
GITACXTTAA 
AATTTGTCAT 
GATAAAGAAA 
CTGTCTCTAA 
AC^ACITTA 
ATAAAACTGA 
ATTCQACCAA 
GAGTTG6AAA 
CQTGGCCCAG 
TCCCAGGTCC 
AATQTTOOCT 
GCTTCCASCC 
CSkAAGCCATG 
TCTGCCCCTG 
ACCAGCAiOTA 



TOCCCOCCIQ 
GOCCXACAQC 
GCTGTGATCA 
GCZUATCITC 



11 
1 

CTGTCAGGCA 
TCCTTGTCAT 
ATTCCA6TTT 
GCTTACTCCC 
CTATTTGCAA 
GCACT6TTCC 
□TOAATTAAA 
TAIVTGIQTGC 
ATAATACAAT 
XGGAACACIG 
AGCTTCAGTG 
CATTTTCTTC 
CCAAAGCTAC 
CrCCAADAOG 
CTGCCATTQA 
TCTGOGGCAC 
COAATGTCAA 
TTTCTGATCT 
AGCCTAACCT 
AC3^TGCTQGC 
TOAACTTTTC 
GAGTGAATGC 
AGGTTTCTCT 



21 
1 

6TGTGGCCAT 
CATTTJGTCTT 
OTCACCACCA 
TTCAAAOGAA 
TQACTC3U3CA 
CCAGAATCAA 
ACOCTCAQJUS 
TACAiaCAGAG 
GAATQCATOT 
CTGCTCJTTCT 
TGACCTGCAG 
CAGCCAATCC 
CTCTTTTGCT 
aOAQATTCAA 
CATGOCCOCA 

cc^vccTOcr 

CACTAOCAGC 
TGAOAACCAA 
CGCAJGOACAA 
CCCTCTGGCT 
AAACACOACr 
CSkGXAGTTTC 
G6AAACCCAA 



31 
1 

GTTGGCAQAA 
CaTGirCGTTC 
OCT53AQQTTG 
ACAGOCGTCA 
TTTTTTAGAG 
CATATAAOBA 
CTC3VACAAAA 
GC0CAAA6CA 
GCTOTAAITAG 
GTCAGGATAC 

GATCxxavrrG 

ATCCCAGTOG 
GfiGCCrrCCAG 
COOCTTTCAC 
CAGTCIEGAAA 
GTQAAASSCCT 
GCAOCXCCTG 
GTGTTGC3^j3A 
ATQATCAACC 
CAAA6ATTGC 
ATAASTCTAA 
AACACAACIA 
GCTGCXGAlGA. 



41 

I 

CTGAAiSAjUaT 
TGGTAACATC 
AAACAAGAAG 
AACOCX^AGAG 
GTGAGATCAT 
ATG6CAOCTT 
CCXTTGCAAAC 
CATVAAATTG 
CTGCITTOGA 
CCTGCCCTTC 
rCTGTCTTGG 
TOCCTOGGGC 
ATTATICACC 
C2CX3M3CCTTC 
GGAXCICITC 
C3«TTTOC?rC 
TCCAGACAGA 
TOGAOAAGGC 
AAGTCnGCAG 
TGAAAG^rAGr 



1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980
2040 
2100 
2160
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 
2760 
2820 
2880 



51 
I 

PI3VTX«SZiIiF3 
TVPQMQBIIS 
VRXPCP93FE 
EIPPDYSPVTH 
VKASFSSPTV 
HZXIQfVSRLm 

m^rFVAQDP 



lUSGffSDNGCS 
IFIifiVTLVXy 



LGAQBETSIQ 
VAKSHVRKQV 
TTDIiVNNDCa 
ALRRTSKaGS 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900
960 



OCTTTGXGGC 
ACAQXnnGG 



51 
1 

TTTACTGACG 
OCTGGAAQAA 
CCrCAATCAT 
AAATATCTGC 
GTTTCAATAT 
AACTQGAGTC 
CCTAASTGAG 
TACATTCACA 
AAGAGTAAAG 
CTCCCCAGAA 
TGACCATOCA 
CRCTGTGCrr 
TGTGACCCAiC 
AGCTOOCATA 
CCCTATGCOC 
TCCCM2CGTQ 
CATCGTCAAC 
TCTQTCCTTQ 
ACTOCTTCAT 
GOATGACATT 
TTTGGCTCTQ 
CCRMSUXCt 
CACAATTACT 



60 
120 
180 
240 
300
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 






CTTCCTTCAT 
CRGTTCnATT 
CTGATCAGCT 
AACQTGACAG 
GTATTTTGGG 
GTCIUUWSACA 

ATTACykTATA. 
ATAGCTTTTG 
GCTCTGCrrTC 
CftAGGCCTCr 
TtSGATGOGCC 
ATOCGAAAAT 
ACCATCATCC 
AATGOTTCAC 
6XGGGKTATT 
CAJC3CTCTOTC 
GACCTCAGGA 
TTTC3CCTC5GG 
CAAC56ATTTT 
AGGOGQTATCi 
GAGRGGAATG 

CTCAGAAGGA 



GGCIOA1X5AA 
TTTTTGAAAC 
ACGTCATATC 
TCACATTAAA 
ACTTQQGCAG 
GGAGATTGAA 
ACCTATCTAG 
TTGGTTGTGG 
AAAAGATCOG 
TGCT(3AACCT 
GCATCTCAGT 
TAOAAflCATT 
AC3VTCCTTAA 
TCACTATATC 
CGGATGACTT 
TCTGTGTQAT 
GAATTAAAAA 
GTATCGCTQQ 
GACCAGTTAA 
TCATATTCAT 
TTTGTTQTGQ 
gggtctcttt 
ACAT6TTTAA 
CTTCAAAGCa 



TAATT TACC A 

ACC1:GCTTT6 

AToaAoraTT 

GCACATCAftC 
AAATGGTGGC 
TGAAAOCATC 
GACATCTQTG 
GCTTTCATCA 
GAQGQATTAC 



GBCTGTATTT 
GCATATGTAC 
ATTCTGCATT 
CCCAGATAAC 
CTGCTGGATC 
ATTTXTGCT6 
GAAGAAGCAA 
CCTTACSVrTT 
OGTGACCTTC 
CTTTTA^XaT 
AAAGTThOGQ 
TAaXGTTCZUS 
OGAGAAOQftA 
QGCSAAGCITA 



GCTCATGACA 
TTTCaGGATC 
GCAAACCTGA 
COGAGCCAGG 
AGfiGGAQOCT 
TGTACCTGTA 
CTGOCTGCTC 
ATTTTTCrrGT 
CCTTCC3i?W^ 
CTGGACrC3GT 
CTTCATXATT 

QTC3QGTTGGG 
TATGGGCTTQ 
AACAACAAT6 
AACGTCAGCA 
CTGGGAGCCC 
TTACTGGGAA 
ATOTATCTGT 
GTG6CX»AAG 

AATGGA6ATO 
OATTOCTGCA 
CACTTTATTa 



TGGAGCTAGC 
CTTCCCTGGA 
CCGTCAGGAA 
ATGAGTTAAC 
QGTCAGACAA 
GCCATCTAAC 
AAATGATGGC 
CAGTGACTCT 
TCCTCATCCA 
GGATTGCTCT 
TTCTCTTQar 
TCAAAtSTATT 
GGGTACCAQC 
GATCCIATGG 
CASTATTCTA 
TOTTCATTGT 

TAACITGGGG 
TTGOCATCTT 
AAAATQTCAG 
ATXCTGSAAA 

ToraccrccA 

AUGGGAAACG 
AQCAAATGTG 



ttccagggtt 

GRACCTCrcr 
CTTGACAAGA 
AGTGAGATGT 
TGGCTGCTCT 
AASCTTOGGC 
TCTGAOQTTC 
TQTAACCTAC 
GCTGTGTGCT 
CTATAAOATG 
CiCATTCaiCA 
TAATACTTAC 
TQTGGTTGTG 
GAAATTCXCC 
CKTTACGGTG 
GGTCCTGGTT 
CTGTATTCAA 
CTTTGCCTTC 
TAATACCTTA 
GAAGCAATGG 
TQCITCTACA 
CGATTTCACT 
GCOTATGGCX 
A 



1440
1500
1560
1620
1680
1740
1800
1860
1920
1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2640
2700
2760



Seq ID NO: 490 Protein sequence
Protein Accession #: Eos sequence



1 
I 

HVFSVRQGGH 
VTLSLIiPSHE 
IjSLSEIiKRJSE 
IRPMKHCCCS 
SQVPKATSFA 

GSLBIMXjAGB 
AVZRV19ASSF 
QEZlFFSTPAL 



QGIiCXOTAVF 
TIILTISPDN 
QliCRIKKKKQ 
QQFFIFXFlfC 



11 
I 

TQVKFQUNIC 
LNKTLQTIiSE 

EPPDYSPVTH 
VKA&FS&FIV 

STTTFVAQtIP 

RGGHfiDMGCe 
IFLSVTIiVTY 

TGLGSTCJCPP 
U3AjQRKTClQ 
VAKENVRKQW 



21 
I 

FKlPbVirCL 



TYFIMCATAE 
ELEKIjQCDXiQ 
HVPSPIGEIQ 
SAPAHVNTTS 
SPPDMUkPIiA 
AKLQVSXfBTQ 
liIEYVIfiSSV 
VKDHRIiNETr 
lATEKZBSDy 

DLRSIAGLTF 
SRYLCCGKIjR 



31 
I 

HWLVTSI«EB 
FFEBBIMFQir 
AQSTLWCTPT 
DPIVCUiDSP 
PLSPQPSAPX 
APFVQTDIVK 
QRLUCWDDX 
APENSIGTIT 
ANlTVHNIjTR 
CTCSHIiTSFG 
PSKZI>ZQLCA 
IiAIiVKTFNTY 
WMMAVFYrTV 
JiiliOITHOFAF 

HFIKQK 



41 
I 



DKBSTVPQNQ 
iKUnrrHE^AC 

RGPPFSSSQS 
ASSPAIDNPP 

QEiQURPSirrr 

laPSSUlNNIiP 
UVTVTIiKHlN 
VLLDLSRTSV 
AI.riIiI2n.VPL 
IRKYIIiKPCX 
VOyPCVIPLL 
PAWGPVNVTF 
EKNGVSFSVQ 



SI 
I 

PEVBTTSXiND 
RX^SGTIiTGV 
AVIAAI>ERVK 
IPWPRATVL 
Q6ETI6SPKP 
VZiQHEKALSli 
ISI>TSPSI>AIj 
AHDMEI.A3RV 
PSQDSLTVRC 
UAQHMALTP 
U>SHIALYKK 
VOHOVPAWV 
MVSMFIWLV 
MyXiFAlFHTlj 
NQDVCLHDFT 



Seq ID NO: 491 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..3045



1 11 21 

I I 1 

ATOCSTTTTCr CTGICAGGCA GTGTGGCCAT 
TXCAAGATAT TCCTTIQTCAT CATTTGTCTT 
fiAIACTOAlA ATTCCAGTTT GTCKCCACCA 
GGCXCCTOCA ATGAGGTTOA AACAA£3Ui^ 
TCAAACGAAA CAGAAAAAAC TAAAATCACT 
AAAOCCCAGA GAAATATCTG CZAATITTGTCA 
GGTGAGATCA TGTTTCAATA TGATAAAQAA 
AATGGCACCX TAACTGGAGX CCIGTCTCTA 
AlCC!C3X3GAAA COCTAAGTOA QACTTACTTT 
ACATTAAATT OTACATTCAC AATAAAACTO 
GCCGCTTTGS AAAGAGTAAA C3ATTOGACCA 
CCCTOCSGCIT CCTCCCCAGA AGA0nm3A 
GICIGICTTG CTGACCATCX: A OMSG CCXa^ 
OIQCCIOGSG CCACIGTGCI TTOOCAQGXC 
GATIATTCAC CTGTGACCCA CAATGTTOCC 
CCOCWaCCTT CAGCTCCCaT AGCTTCCW3C 
AGGATCTCTT OCCCTATGCC CCAAACCCAT 
TCATTTTCCT CTCCCACCGI GTCIGCGCCT 
erOCAGACAG ACATOSTCAA CACCAOOiGT 
ATGOAGAAQQ CTCTaTCCTT QGGCAGCCTC 
CAAGTCAGCA GACTOCTTCA TTCCCCQCCT 
CTQAAAGTAG XGGATGACAT TGQCCTACAG 
A<X2TGG0CTT CTTTGSCTCT GGCTO30AIC 
ACCTTTGIGG CCCAAOACCC TGCAAATCTT 
AACAOTAT!TG GCACAATTAC TCTTOCTTCA 
ATGGAGCTAG CTTCCAGGGT TCAGXTCAAT 
CCITCCCTG3 ASAACCTCTC TCnSATCAGC 
ACOSTCAGGA ACTTGACMG AAAGGT6ACA 
GAaBAmriAA CASIOAQATO TQTATTTTGG 
TGGTCAIGACA ATQGCTGCTC TGTGAAAGAC 
AGCCATCTAA CAAGCTT06G OGTTCTGCIG 



31 



120 
180 
240 
300 
360 
420 
480 
540 
600
660 
720 
780
840 
900 



41 51 

I 1 I 

GTTQGCAGAA CTGAAGAAGT TTTACraACQ 
CATGTOQTTC TGOTAACATC CCTGGAAGAA 
CCTGCXAAAT TATCTGn^OT QWQITTTGGC 
CTCAATGATG TTACTTTAAG CTTACTCCCT 
ATflGTAAAAA CCTTCAATGC TTCAGGCJGTC 
TCTA'i'i'TUCA ATGACTCASC ATTTTTTAGA 
AGCACTGTTC CGCAGAAXCA ACATATAAlOG 
AGIQAATTAA AAOGCICAGA GCTCMfSlAA 
ATAAT6TGTG CTACAfSCIUBA GGOCCAAAGC 
AATAATACAA TGAATGCATG TGCTGCAATA 
ATGGAACACT GCIGCTSTTC TGTCftSCATA 
AAGCTl:CAGT GTGAOCTOCA GORICCCATT 
OCATTTTCTT OCAGCSCAAXC CATOOCAGXG 
OCCAAftjCsCTA OCXCTTTTGC TGASGCTOCA 
TCTCCAATRiQ GGGAGATTCA ACCCCTTTC». 
OCTGCCATTG ACATGCCOCC ACAGTCTGAA 
GTCTCGGGCA OCCCACCTOC rTGTGAAAGCC 
QOBMa&EC^ ACi^GTAOCAG OGCAPCTCGT 
A'nTCTGATC TTQAQAAOCiA AGIGITGCAG 
QA60CTAACC rTGGCSVSGAGA AATQATCAAC 
GACATGCTGG CCCCTCTOCSC TCAAAGATTG 
OrrGAACTTTT CAAACAOQAC TATAAGTCIA 
AGftSIGAAIG CCAGTAGTXT CAACACAACT 
Cag GT TT C TC TGGAAACCXA. AQCTCCTGAG 
TCBCTGATGA ATAATTTACC AGCTCATGAC 
TTTTTTGAAA CACCTGCTTT GTTTCRGGAT 
TAGGTCATAT CATGGAGTGT TGCAAACCTQ 
OTCACATTAA AiGCACATCftA CCCSAGCCA6 
GACIT«9aaC3L GAAATGGIGG Ca^GAQGAOGC 
AGGAGATTBA ATOAAACCAT CTQiEaCCXGT 
GACCTAWCA GGACATCTGT GCTGCCTGCI 




60 
120 
180 
240 
300 
360 
420 

480

540 
60O 
660 
720 
780
840 
900 
1020

1080

1140 
1200 
1620
1680 
1800
1860 
CAAATGATGG

A-XCCTCATCC 
TGOATTGCTC 
TTTCTCTTGa 

GOOQTACCAG 
GGATOCTATS 
GCAGTATTCT 
AT6TTCATTQ 
CAGGGaAAAA 
ATAACTTQGa 
TTTGCCATCT 
GAAAATGTCft 
AATTCTGACP 



CrAGTGMTA 
AATGOGGTCn: 
CAGCACATGT 
AG6ACTTCAA 



CTCTGACOTT 
TTGTAACCTA 
A13CTQTQXGC 
TCTATAAGAT 
TCTCATTCAC 
TTAATACTTA 
CTOrGGTIIsT 
GGAAATTCCC 
ACATTACGGT 
TQSTCCTQGT 
GCAlC^TATTCA 

TTAATACCTT 
C3GAAGCAATG 
GGAQTAAAAC 
CTTCAAATTC 
ATGATTGCTC 
CTTTTAGTGT 
TTAACJGRGAA 
AQOGGOGAAG 



CATTACATAT 
CATAQCTTTT 
TGCTCTQCTT 
GCAAGGCCTC 
ATG0ATGGGC 
CATCGGAAAA 
GAOCATCATC 
CAATGGTTCA 
GGTQGGATAT 
TCAQCTCTGr 
AfiACCTCAOS 

ACAAGGATTT 

CTTACAGTCA 
AiS^ACACGCA 
TCAOAATQOA 
GGAAGATTC5C 
CTTACACTTT 



AXTGGTTGTG 
GAAAA^TOC 
CTGCIGAAOC 
TOCATCTC3V.G 
CIAGAASCAT 
TACATCCTTA 
CTGhCXATAT 
COSGATGACr 
TTCTC5TGTGA 
CGAATTAAAA 
AOTATCQCTG 
GGACCAGTTA 
TTCATATTCA 
CTTTGTTGTG 
GGTTTAAAGA 
AGCAGTAACI 
AGOGGGAAT6 
GArrGTGTGCC 
TGCAAIQGOA 
ATTGAQCAAA 



OGCTTTCATC 
GGAISGGATTA 
TGGTCTTCCT 
TC3GCTGTATT 
TCCATATQTA 
AATTCTGCAT 
CCCCAGATAA 
TCTGCTGGAT 
TATTTTTGCX 
AGAAQAAfiCA 
GCCTTACATT 
ACGTGACtmC 
TCTTTPACTG 
GAAAOTTACS 
AGCAGACTGT 
CCACTAACTC 
GAAATGCTTC 
TTGAOQATTr 
AAGGCOGTAT 
TGT6A 



AATTTTTCTG 
CCCTTCXAAA 
CCTGGACTCQ 
TCTTCATTAT 
CCTQGCCCTT 
TGTCQGTTGG 
CTATGGGCn'T 
CAACRACAAT 
GAACGTCAGC 
ACTGGQAGCC 
TTTACTGGGA 
CATGTATCTG 
TQTGGCCAAA 
GCTGGCTGAA 
AAACCAAGGA 
CAOCACACTQ 
TAOUSAGAGG 
CZUrCGGAAAA 
QGCTCTCAGA 



1920
1980
2040
2100
2160
2220
2280

2400
2460
2520
2580
2640
2700
2760
2820
2880
2940
3000



Seq ID NO: 492 Protein sequence
Protein Accession #: Eos sequence



1 
1 

MVPSVHQCX3S: 

VCIiftDHPHGP 
PQPSAPIASS 

VQiTDIVHTSS 

KSIGTZTLPS 
TVRNIjTRHVT 
SHLTSPGVJjIi 

VKVFNTYIKK 
AVPylTWGY 



NGVSPSVQHG 



1 

VOSTBBVIiI»T 
LNEfVTI,SE.T>P 
STVPQNQHIT 
HNTmACAAJ 
PF5&SQ&IFV 
PAIDHPPQSB 
ISDIiEKQVIiQ 
LJJIFSKTTISL 
SLHNHIiFAHD 

vhiIchinpsq 
blsrtsvz.pa 

liLKLVPLIiDS 
YILKFCrVGW 
FCVIFLiajVS 
GPVElVTFMyii 
GLKKQTVHQG 
DVCLHDFTGK 



21 
I 

FKIFLVJIdi 
SnSBTSaCTKlT 
NGTI/TGVXtSIi 
AAliERVKIRP 
VPBATVIiSay 
TISSFHPQTH 
KEKAL3XiaSI> 
TSPGLAIAVI 
HELASRVQFN 
BEIiTVRCVPtt 
QMMAI/TPITY 

GVPAVWTIJ 
KFIWliVQLC 



31 
I 

HWIiVTSXtBS 
IVKTFHASQW 
SEHjKRfiELNK 
HEHCCCSVRl 
PKATSPAEPP 
VSOTPPPVKA 
EFNIiAOEMIH 
KVNAS3FVTT 
FFSTPAItFQD 



41 

I 



IGCGLSSIFL 
CXSVAVELHT 

i/nepraJYGL 

RIKKKKQIiOA 
FIFIPYCVAK 



TLQTIiSBTYF 
FCFCSPEELG 
DY3PVTHHVP 
SFSSPTVSAP 
QVSRLIJISPP 
TPVAQDPANL 
PSLEffiUSEiIS 
WSDIiGCSSVKD 
SVTUVTYXAF 
PLIiVGFXKMG 
GSYGKFFKOS 
QEUCTSZQDIrH 



51 
I 

PAKXiSWSFA 
8ICNDSAFFR 
IMCATAEAQ9 
KLQCDLQDPI 
GPIGBIQPX^ 
ANVHTTSAPP 
DNLAPLAQSIi 
QVSIiETQAPB 
YVX£fiSVAKL 
SRUIBTICTC 

LEAFKHYLAIi 
EDDFCVTHflZni 
SIA0X»TFZJiQ 
XiCOGECIiRLAB 



LVHNDCBVHA 
RT8KRGSI1BF 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



ISQM 



Seq ID NO: 493 DNA sequence
Nucleic Acid Accession #: NM_015507
Coding sequence: 241..1902



1 11 21 31 41 51

| | | | | |

GOGCAGftGGA GCCTCGGCCak GGCTSVGGCaG GGCGCXXZCCA GOCCCTCCGC AQGCCGCGAG 60 

aaOCCCTGCC GGOGTGCCTG GCCTCCOCIC CC&GACTGCA GGOACAGCAC COGGTAACTS 120 

C3GAGTGGAGC QSAGGACCCG AGGGGCTGAG GAOAGAGGAG GCGGCGGCTT AGCTGCTAOG 180 

OGGTCGGSCC GG06C0CTCC CGAfSGOOGGC TCAGGAGQAO GAA6GAGGAC O0GTGC?QAGA 240 

ATQCCTCTaC CCTGGAGCCT TGOGCTCODS CTQCTGCTCr CCrGGOTOGC AGGTGGTTTC 300 

GGGAAC3GGG6 CCA0TGCAAQ GGKK9U3SGG TTGTZAGC3a' C3GGCAOGTCA OCXTTOGGGTC 360 

TGTCacrATG GAACTAAACT GGCCTGCIGC rEAOSGCXGGA GAftGAAACM CAAOGGAGTC 420 

nGTTGAIVSCTA CATGOSAACC TGGATGTAAG TTTX3GTGRGT GC3QTeGGAOC AAACAAATOC 480

AGATGCTTTC CflSGATACAC CGGGAAAACC TGCAGTCAAG ATGTQAATGA GTffTGGAATG 540 

AAACGOOGGC CATQCCAACA CfeGATGTGTG AATACAOU^S GAASCXACAA QTQCTSTTGC 600 

CTCAfilGGCC.ACniQCTCAT GCOUffilSCT AOBTOTGTGA ACZCTAGG^ ATGTGCX^ATG 660 

ATAAACTGTC AQT3«3iSClG TGAASACIWA GAAfiAAOGGC CRCAGTGCX^ GTGrCCATCC 720 

TCRjOGACTOC GCCTGGCCCC AAATGGAAGA GRCTGTCTAG ATATTGaWlGA ATGTGCCrCT 780 

GGTAAAGTCA TCIQTCCCTA CAATCGRAGA TGTGTGAACA CATTTGGAAG CTACTACTGC 840 

AAAT6TCACA TTGGTTTCQA ACTGCAATAT ATCAGIOGAC QATATGACTG TATAGATATA 900 

AATGAATGIA CTATGGASAfi OCKTAC39XGC AGCCACCATG CCftATTGCIT CAATACCCAA 960 

GGGTOCITCA AGTGTMATG CAAGCRGGGA TATAAAGC3CA ATGGACITGG GTGTTCTGCT 1020 

ATOCCTGAAA ATTCTGTGAA GGAAGTCCTC AGAGCACCTG GTACCATC31A AGACAGAATC 1080 

AAGAAGTTGC TTGGTCACAA AAACAGC&TG AAAAAGAAGG CAAAAATTAA AAATGTTACC 1140 

CCAGAACCCA. CCAQGAiCTCC TACOCCIAAG GTGAACTTGC AGOCCTTCAA CXATGAAGAG 1200 

AT2^GXTTOCA GftSGCGGGAA CTCXCAXQSA GSZAAAAAAG GGAATGAAGA GAAAATGAAA 1260 

GAGGQQCITG AQGATGA<3AA AAGAGAKGAG AAAGCOCTGA AGftATGACAT AGAGGAGOSA 1320 

AGCCT60GAG GAGATCTGTT TTTCCCTAAG GT<^ATGAAG CAGOTGAATl? OGGOCTGATTT 1380 

CTGGTCCAAA GGAAAGCGCT AACXTCCAAA CTQGAACATA AAGATTTAAA TATCTOQSTT 1440 

GACIGCAGCT TCAATCATGG GATCTGlGAC TGGAAACAGG ATAGAGAAGA TGRITTTQAC 1500

TOGAATGCTO CIGATCGAOA TAAT8CIATT GaCTTClATA TOGCAGTTGC GOCCTrGGC^ 1560 

OGTCftCAAGA AAGACATTGG G0C9ATTGAAA CTTCTOCTAC CTGACCIGCA A0CXX:AAAGC 1620 

AACrrCTGlrP TQCTCTTTGA TTACaSQCro GCICGGAGACA AAGTOQGGAA ACTTCGAGTG 1680 

TTIGTGAAAA ACAGTAACAA TOCCCTOGCA TGGGAGAAGA CCACGAOTGA GGATGAAAAG 1740

TGGAA6ACAG GGAAAATTC31 GTTGTATC3^ G6AACTGATG CTACCAAAAQ CATCATTTTT 1800 

GAAGCAGAAC GTOGCAAGGG CAAAAGGOGC GAAATOSCAG TOGATGGCGT CTTGOTTGTT 1860 

TCAG(3CTTAT GTCCAiaATAG OCTTTXATCT GTGGATGACO: GAATGTTACT ATCTTTATAT 1920 

TIGACTITGT ATGTCAGTTC CCTGGTTTTT TT^TATTGC ATCATAfiGAC CTCTGGCMT 1980 

TTAGAATTAC TAGCFGAAAA ATTGTAKKST ACCAACAGAA ATATTATTGr AAGATGCCIT 2040 


TCTTOTATAA 6ATATGCCAA. TATTTGCTTT AAATATCATA TCftCTOTATC TTCrC2«STCA 2100 

TTTCTGAATC TTTCCaCATT ATATTATRAA ATATOGAAAT STCAGTTTAT CTCGOCTCCT 2160

CROTATATCT GATTTGTATA AGTAAGTTGA TGAGCTTCTC TCTACftACAT TTCTAGAJ«A 2220 

TAGZ^AAAAAA AOCACAGAGA AATGTTTAAC TQTTTGACTC TTATGATACT TCTTOGAAAC 2280

TATGACATCA AAGATAGACT TTTGCCTAAO TGGCTTACCT GGOICITTCA TAGCCAAACT 2340
TCTATATTTA AATTCTTTGT AATAATAATA TCC31AATCAT CAAUIAAAAA AAAAAAAA 
Seq ID NO: 494 Protein sequence
Protein Accession #: NP_056322



1 
I 

MPIrFnSIiAIjP 
CBATCBPGCK 



GKVICPYNRR 
GSFKCKCKQG 
PEPIRTPTPK 
SLRGXnrPFPK 
HNPADRDfilAI 
FVKNSNHAIA 
SGliCPDSIiLS 



11 
I 

LliLSHVAGGF 
FGECVOPNKC 
TCVNSRTCZM 
CVNTPGSYYC 
YKGHGLRCSA 

VNBAGEP6LI 
GFYHAVPMjA 



21 
I 

GMAASABHHQ 
RCFSGyXGKT 
INOQirSCBOT 
KCHIGFELQY 
IPEHSVKEVIi 
tVSHGGffirSHG 
LVQSKMbTSK 



31 
I 

CSQDVHBC3QH 
EBGPQCLCFS 
ISGRYDCIDt 

RAPGTIKDRI 



41 



CKTOTKLRCC 
XPkPOQHRCV 



. NKTGKIQLYQ 



IiBBKDIiHXSV 
IiIiIiEDLQPQS 
OTDATK5I1F 



USCIMDSBTC 
KKIiIAHKMSH 

egledekreb 

DCBFKHGZCD 
HFCULFDYSL 



SI 

1 

YGHRRNSKGV 
J4THGSYKCFC 
DCLDIDECAS 
SHHAHCFHTQ 
KKKAKIKNVT 
KALKKDIEER 



AGDKVGKIiKV 
EIAV£)SVLLV 



60 
120 

180

240 
300 
360 
420 
480 
540 



Seq ID NO: 495 DNA sequence

Nucleic Acid Accession #: NM_003506.1

Coding sequence: 259..2375



1 11 21 31 41 51 

| | | | | |

GCnGCTCCAG TCCCGGAGQC AACCCOGGAO GCaTCTGAGG TCCCTGGGGQ GAACOGTGGG 60 

TTAfaAOQGGG AOBGGAAGGG ACAQCOOCCT TCG2VC06CCC CCOaAGTI^T TGACCCAGGA 120 

CTCRTTTTCA GGWUVGOCTG AAAATGA6TA AAATAGTOfiA ATGRGGAATT TQftACAlTTT 180 

ATCTTTC5GAT GGGGATCTTC TGnOOATQCA AftGAGTGATT CATCXSiAGCC ATGTGGTAAA 240 

ATCAGGAATT TQAAQAAAArT GGAiGAIGTTT ACAITTITGr TGACGTGTAT TTTTCTACGC 300 

CTCSCTAAfiAG GGCAC»GTCT CTTCMXrXGI GAACCftATTA CTQITOCCAG ATGTKTOAAA 360 

ATGGOCTACA AfZATQACGTT TTTOCCTAAT CTGAIGSGTC ATTATGlMirCA QMSIAfTGCC 420

GCHGTGSBAA TGGAGCATTT TCTTCCTCTC GCAAATCTGG AATGT1?CACC AAACATTOAA 480 

ACTTTCXrrcr GCftAMCaiT TGTACCAACC TGCATAQARC AAATTCATGT GOTTCCACCT 540 

TQTCBTAAAC TTTGTGAGAA AGTAXATICT GATTGCAAAA AATTAAITGA CACITTTGGG 600 

AIGGGATGC3C CTOAGGAGCI TGAATGTGAC AGATTACAAT AGTGTGATOA GACTGITGCrr 660 

QTAACTTTTG A!CGCACACAC AGAATTTCTT GCTOCTCAGA AQAAAACASA ACAAGTCCAA 720

AGAGACATTG QATTTTQGTG TCCAAGGCAT CTTAAGACTT CTG06GGACA AGGATATAAG 780

TTTCTGGGAA TTGACCAGTG TGOGOCTCCA 'XGCCOCAACA TOTATTTTAA A2VGTGAT(^G 840 

CTAGAGTTTG CAAAAAGTTT TATTGGAACA GTTTCAATAT TTSGfTCXTIG TGCAACTCIG 900 

TXCACKTTCC TTACXTTtTT AATTGATGTT AGAAGA'TrCA GKXACCCAOA GAGAOCftATT 960 

ATATATTACT CIGTCTG>n?A CaSCATTGfA TCTCTTATOT ACXXCPOTQa ATTTTTOCTG 1020 

GGCGATAGCA CRGCCTGCAA TAAGGCAGAT Q7U3AAGCTAG AACTTGOTOA ChCXGTIGTC 1080

CXAGGCTCTC AAAATAAGGC TIGCAGCGIT ITGTTCATac: TTTTGTATTT TTTCACAATG 1140 

GC3X3C3CACIG ^CGTOGTOGGT GATTCTTAOC AT1ACTT6GT TCTTAGC3XSG AGGAAGAAAA 1200 

TGG2ySITGTG AAGCCATCGA GCMUUUUSCA GTQTGGTTTC AVGCTGTTGC ATGGGGI»£!A 1260 

CCAGGTtrrCC 'TGACIGTTAT GCTTCTTGCT CtGMUCMMS TTQAAQGAGA CAACATTAOT 1320 

QGAGTTTGCT TTGTTGGC?CT TTATGAOCTG GATQCTPOTC eCrACTTTGT ACTCTTGOCA 1380 

CiGTGCSCrrr GTGTGTT7:GT TOGGCTCTCT CTTCTTTTAO CTGQCRTTAT TTCXTTTAAAT 1440 

CATCTTOOAC AAGTCATACA ACATGATGGC CQGAAQCftAG AAAAAC7AAA GAAAlTCTATG 1500 

ATTGGAATIG 6A0ICTTCAQ OSGCTXGXAT CSTGmOCCAT TAfiXGAGACT TCICC3GAT0T 1560 

TAGGTCIATG MCAAOTGftA CAGGATTACC TOGOnSATAA CTTGGGTCXC TSATCAtTGT 1620 

CGTCnGTACC ATATCCCATQ TCCTTA7CAG GCAAAAGCAA AAI^CTCGACC AGAATTGGCT 1680

TTATTTATGA TAAAATACCT GATGACATTA ATTGTTOGGA TCTCTGC^GT CTTCTGQSTT 1740 

GGAIU3CAAAA AGAiCATOCAC aOAATGGGCT GGGrXTTTTA AACGAAATGG CAAG&GAG&T 1800 

CCAATCASEO AAlUSTCGAAG AGTACCftCAG GAATfCiaGIG ACTTTTTCTT AZkAGCftCAAT 1860 

TCTAAAGTTA AACACAAAAA GAIUSCACTAT AAACCAAGTT GMACftAGCT OAAGGTCATr 1920 

TCXAAATQCA TGGGAACCAG CACAGGaGCT AiCAGChAATC ATGGCSUTXIC TGCA6TAGCIV 1980

AITACTAGCC ATGATTACCT AGG&C?i3«SAA ACTTTQAC3U3 AAATCCAAAC CTCRCCftGAA 2040 

A£»3X2hATGA GAGAGGTGAA AOOGGAOaOA GCXAGCACXIC CCAGGtXAAG ASAACAGQAC 2100 

TGTGGXWU:: CTGCCIOSCC 2USCAGGATOC ATCTCCAiSAC TCTCIGOGQA ACAGG!CGQAC 2160 

GGGftAGGGCC AGGCnOGCAG TGTATC3GAA ASTGGGOGGA GTGUIGGAAG GATTAGTCCA 2220 

AAGft<3TaATA TTACTGACRC 'TGGCCTGGCA CAGAG<:IAACA ATTTGOUSGT COCJCSUBTTCT 2280 

TCRGAACCAA GCAGGCTCAA AGGTTOCACA TClCrTGCTTG TTCAOGCAGT TOJCAGGAGTG 2340 

AGAA7UU3AGC AGGGAGGTGQ TT^TCAITICA GATACTTOAA QAACA.TTTTC TCTGGTXACT 2400 

CAGAAGCAAA TXTGIOTTAC AC3Q8AM3TG ACCSSUraCAC ZGTTTTGTAA ISRKTChCIGT 2460 

TACGITCTTC TTTTGOICTT AMU3TTGGAT TGGCTACTOT TATACXCGAA AAAKTAOAGT 2520

TCRAQAATAA TATGACTCRT TTCACACAAA GGTXAATGAC AftCAATATTUT CTGAAAACftG 2580

AAATGTGOVG GTTAATAATA TTTTTTTAAT AGTGTGGGAG GACftGAGTTA GAGGAATCTT 2640 

CCXTXTCTRT TTAlGftAGAT TCTACTCTTO OTAAQAGTAiT TTTAAGATOT ACTATGCTAT 2700

TTTAGCXTTT TGATAXAAAA TCAAGAHAtTT TCTITGCTGA AGTATTTAIhA TtTTTATCCTT 2760 

CSTATCTTTTT ATACATATTT GAAAATAMSC TTKTATQTA'r TTGAACTTTT TTGAAATCCT 2820 

ATTCAAGTAT TTTTATCAT6 ClATTGaXSAT AITTTRGCAC TTTGGraGCT TTTAOWrrQA 2880

ATTTCTAADA AAA^TTGTAAA ATflOTCTTCr TTTATACTGT AAAAAAAGAT ATACCflAAAA 2940 

QTCTTATAAT AOGAATITAA CITTAAAAAC CCACITATTG ATACCTIAOC ATCTAAAATG 3000 

T6TGATTTTT ATAGTCTCGT TIXAGGAA3T TCS^CAGATCT AAATCKIGTA ACTGAAATAA 3060 

QGTGCTTACT CAAAGAGT6T CCACTATTGA TTGTATXATQ CIGCTCACTQ ATC3CT3CT8C 3120 

ATATTTAAAA TAAAATSTCC TAA2U9GGTTA OTABACAAAA TSTTAGTCTT TTGTATATTA 3180 

GGCCAAGTGC AATTGACTTC OCTTTTTTAA TGTTTCATGA OCACCGATTG ArVSTPSCnCC 3240 
AACCACTTAC AGTTOCTTAT ATTTTTTGTT TTAACTTTTa TTTCTTARCA TTXftGAATAr
TACRTrTTGT ATTATACAiST ACCTTTCICA GACATTTTGT AG 

Seq ID NO: 496 Protein sequence
Protein Accession #: NP_003497.1

1 11 21 31 41 51 

| | | | | |

HEHPTFLI*XC IFLPLLRGHS LPTCEPITVP RCMKMATfNMT FFPULPH^TYD QSIAAVEMEa 60 

FLPLAlilLBCS PNIETPLCKA FVPTCIEQIH VTOPCRKLCE KVYSDCKKLI DTPaiRHPEK 120 

LECDRliQyCD STVFVTFDPH TBFliGFQKKT EOVQBDXGBVI CPRH&ICrSaG QOYKFLGIDQ 180

C!A£>PG£NMVF K8DEX<KPAKS PZGTVSXFCL CATLPTPI/PP IiIDVRKFRYP ERPIITYSVC 240 

YSIVSUtYFI OFliLGDSTAC NKADEKUSLG T>TVVIiGfiQNK ACTVLFHUbY FFTMACTVWW 300 

VlLriTHFIiA AGRKMSCEAI BQKAVWPHAV AWGTPGFItTV MLLALHKtfaG DMISGVCFVQ 360

I.yi>U:ASRyF VLLPLCIiCVP VGLSIjZ«LAOI ISIAmVROVI QBDGRNQEKL KKTMIRIGVF 420

SGLYIiVPLVT LLGCYVYEQV NHITHBITHV &DRCRQYHIP CFYQAKAKAR PELALFKXKT 480 

IiMTIiIVaT&A VFWVG&KKIC TEMftGFFKRN RKSDPIfiESR KVUQESCBPV LKHSSKVJKHK 540 

KKHYKFSSHK I»KVISKSK(3T STQATANHGT SAVAITSHDY LOQETE/rBlQ TSPETSMREV 600 

KADaAJ5rFPRI> REODCGEPAS PAASISRLSG EQVDOKGQAG 8VSESARSS0 RI5PKSBITD 660
TOUiQSinStLQ VP8SGEP8SL KG9T9I»IiVHP VS6VRKBQGG GCHSDT 

Seq ID NO: 497 DNA sequence
Nucleic Acid Accession #: NM_005046
Coding sequence: 16..777

1 11 21 31 41 51 

| | | | | |

G6ATTTCOGG GCTOCATQGC RAGATCCCTT CTCCrOCCCC TGCRGATCCT ACPGCTATCCr 60
TTA0C3CTTGG AAACTGCAGG AGAAGAASCC CAGQGTGACA AGATTATTGA TGGC3GC30CCA 120 
KSIGCMGMJ GCTCCSCACCC ArcGGChGGTG GCOCTGCTCS^ G3X3GC2\ATCA GCTCCACIOC 180 
aOAflaCGnrCC ^tGGTCAATGA GCQCTGG6IG CTCACIGOOG CCCACTQOUk <3ATGAAT6A6 240 
TACKOCSSTGC ACC3X3GGCIU3 TGAXAC6CTG GGG6ACA9GA a2U3CTC2USA6 GATCAACK3CC 300 
TOOAAQTCAT TC0GCC3kCCC OGGCTACTGC ACaClRG?iCX:C ATGTTAATSA CCTCArGCTC 360 
GTGAAGCTCA ATAatXAOOC CAGGCT<JTCA TOCATC30TC3A AjGRAJVGTCAG GCTGCCCTOC 420 
OGCTGCQAAC CGCCTGGAAC CACCTGTACT GTCTCGSQCT GGGGCACXJkC CAiGGAGCGCA 480 
GATGIGhCCT TTC0CTC3OA CCCCaflCGTGC GTGGATOTCA ftaCTCaTCTC CCOCCRGSAC 540 
TGCAC3QAAfiS TTTACftAOCSA CTTACraOAA AATTCCATCC TGTGOGCTGQ CATOCCJOGAC 600 
TCCAAGAAAA A03CCT<3CAA TOGTGACTCA GGGGGAOOOT rOGTGTGCAG AGGTACCCTQ 660 
CAAGSrrCTQC ^iCCrGGGG AACTTTCCCT TQCQGCCBAC CCAATGAjCCC AfOGAGTCCAC 720 
ACICAAGTGT GCAA0TTCAC CAAGIGGATA lATGACACX?^ T^AAAAAGCA TO8CXAAO0C 780 
CACACIGAfiT TAA7XAACTG TGTGCTTOO^ AtiASAAAAIG CACA6GAOTG AOGAOSCGGA 840 
TOJUSCCATGft. AGTCAAATTT GRCTTTAOCT TTCCTCAAAO ATAXATTTAA AOCTCATGCC 900 
CTGTTOKrAA liCXSiXrCMA TT8GTAAA6A CCS^AAAAGCA AAACAAATAA ASAAACACAA 960 
AAOOCTCAA 

Seq ID NO: 498 Protein sequence
Protein Accession #: NP_005037

1 11 21 31 41 51 

| | | | | |

HARSIiLIf liQ XIiUiSlAXiKir MSBBAQCSDKI XDQAPCKBGS HPHQVAUiSG VaXiHGGGVLV 60 
MBSnVLTAAH CKMNEZTVSIa OSDrXiODBRA QRIKABRSFJt HPSYSTQlHV MDLMPVICIJgg 120
QARLSSHVKK VRIf&RCB^ GTICTVS6NG 'rrjLaPLl V i'FP 8DLHCVX»PKL l&FQDCTKVY 180 
KDIiLENSHLC A6XH)3KR1IA CHODfieGPLV CRGTLQGLVS WGTFPGGf^ 240 
FTKSfmDXMK KUR 

Seq ID NO: 499 DNA sequence
Nucleic Acid Accession #: NM_007196
Coding sequence: 182..962

1 11 21 31 41 51 

GTTCCCAGAA GCTCCGCAGG CTCnuQTGOA GGAGGAGAA6 GAGGAGGAGC AGQAJQQTGGA 60 

QATTCECAQT TAAAAQGCTC CAGRATOOTG TACCAGGCAG AGAACraRRG TRCTGGGGCC 120 

TCCTOCACXG GGT0C5GAATC AOmSGOXSAC CO0QCCC3CTG GATTCTGGAA GACCICAOCA 180

*DGaaA£I3CCC GOQACSCrcer GGGGCCAAGA GGTGGAIGTT OCTGCTCITG CTG0GGGGA6 240 

GCTGGGCnQO ACACTCCAGQ GCACAOGAOG ACAAGGXGCT GGGOGGTCAT QAGOGCCAAG 300 

OCCavrlCGCA GOCTIGGCS^ GOGGOCTTGI? TCCAGGGOCA OCAACIACTC TGTGGCGGTO 360 

TCX^rrGTAGG TGGCAACTGG GTCCTTACAG CTGCCCACra TAAAAAACOG AAATAiCACAfl 420 

TAOaCCTOOQ AOftCCACAGC CIACAGAATA AAGATGGGOC AGAGCAAGAA ATAGCTGTGG 480 

TICAGTOCftT CCX31CMeXX:C TGCTACAACA GCSUKBAIGT GGAGGACCAC AACCATGATC 540 

TGATOCTTCr TCAACTGOQT GAQCAJGGCAT CSOCTGGGGTC CAAAGTOAAG OCCftTCftSOC 600

nOGCAGATCA TTGCACCCAG CCTG&CCaSA AGTGCAOOST CTCAOGCTOG GaCACTGTCA 660 

CCAQTCCCCX5 AORGAATTTT C3CTGACACTC TCAACTGTGC AOAACTrAAAA ATCTTTCCCC 720

AGAAtSAAGTG TGAGGATGCT TACCOGGGGC AGATCACASA TGGCATGOTC T6TGCASGCA 780 

GCA)GC3UUUaG GGCTQACAm 1GC3C3VSGGGQ ATICIGGAGG CGCCCTGGTG TGT qATG qTQ 840

GhClXXZAGGS GKTCACATOC TGGGGCTChG AOCCCTOTGG GftGGTOCQAC AAACCIGG06 900 

TCTAITAlCCAA. CATCiaCOSC XACCTGGACr OGATC2VAGAA GATCaVTAGGC AGCAASGGCT 960 
GATXCTAOGA TAAGCACTAG ATCTCCCXTA ATAAACTCAC AACTCTC 

Seq ID NO: 500 Protein sequence
Protein Accession #: NP_009127

1 11 21 31 41 51 
MGRPKPHAAK TWKFIiLiLGG AWAGHSRAQB DKVLGC3HE0Q PHSQPWQAAI. FQGQQIiIiOGG 60

VliVGCSNWVLT AAHCKKPKyT VHWSDHSLQN" iOXSPEQEIPV VQSIPHPCYN SSDVKDHNHD 120 

IiMljLOlaRDQA aiiGSKVlCeiS IiADHCTQPGQ KCTVSGWGTV TSPREWFPDT LNCAEVKIPI* 180 

QKBCCBDAYP6 QITDGMVCAG 3SKC3ADT<3QG PSGGPIiVCDG AIjQGITSHGS DPC33RSDKPG 240 
vyTHICfiYbD HXKKIIGSKS 



Seq ID NO: 501 DNA sequence
Nucleic Acid Accession #: NM_006103
Coding sequence: 



1 11 21 31 41 51

| | | | | |

CftCCTGCACC CCGCCCGGGC ATAGCACCAT GCXrTGCTTGT CQCXTTAQGOC CGCTAGCOGC 60

OGCCCTCCTC CTCaocCTGC TGCTGTTGQG CTTCAIXICTA GOOTCAGaCA CAGQAGCftfiA 120 

GZUUSAQIGGC GTGTGOCCOO AGCTCCAfSQC TC3ACCAGAAC T(3C3UDaCAA& ABTGCSTCIC 180

GGACAG03AA K3C3GCCGACA ACCTCRAGTG CTGCW3CGCG GGCTGTGCCA CCTTCTGCCT 240 

TCTCTGCXXa. AAHGATAAOO AGt3t3TTCCTQ CCCCCAGGTG AACATTAACT TTCCCCAGCT 300 

OGGCCaraCGT COaOACCAGI GCCAGGTGGA CAGCCAOrGT CCTGGCCAGA TGAAATOCXG 360 

ccocaaiGtac tgtgggaagg TCTCcrGTor cactcccaat ttctgagotc cagocaccac 420 

CAGGCTGAlGC AOTGAGOAGA GAAAGTTTCT GCCTGGCCCT GCATCTG6TT CCAGCGCACC 480

TOOCCTCCCC MTTTOGGGA CTCTGTATTC CCTCtTGGGC TQAOCACAGC TTCTCCCTTT 540 
CCCAACCAAT AAAQTAACCA CTTTCASCAA AAAAAAAAAA AAAA 



Seq ID NO: 502 Protein sequence
Protein Accession #: NP_006094

1 11 21 31 41 51 

| | | | | |

MPACRLGPIA AAliUiSIiIiIiF GFTLVSSTGA EKTGVCPELQ ADONCTTQECV SDSECADSHjK 60 
OCSAOCATFC LLCSHDKBGS CPQVNZNFFQ XiBLOaQQCQV DSQCPOQHKC C8NGCGKVSC 120 
VTfKF 



Seq ID NO: 503 DNA sequence
Nucleic Acid Accession #: NM_002407
Coding sequence: 65..352

1 11 21 31 41 51

| | | | | |

CCTCCACAGC AACTTCCIT6 ATCCCTGCCA CX3C2^C6ACTO AACACAlSACA GCAGCGGOCT 60 
GGCCATGAAO CTBCTOATGG TOCTGATGCT GCTOOQCCCTC CTCCTGCACT GCTATGCAGA 120 
TTCTGK3CTGC AAACTOCTG6 AfSQACATOOl UGAAAAC^CC ATCAATTCCG ACKTATCTAT 180 
ACCTGRATAC AAAGAGCTTC TTCAAGAGTT CATAGACAGT GATGCC3GCTO CAGftSGCTAT 240 
0(S3<3AAATTC AAGC2WTGTT TCCTCAACCA GTCACATAGA ACTCTGIRAAA ACTTTGGJiCT 300 
GAIGATQCAT ACAjGTGTACO ACAiQCATTTG tSTGTAATATO AAGAGtZUVIT AACTTTACCC 360 
A»30OaTTTB GCXCMSbOGG CTACAGACTA TOGCCACavAC TCATCTOTTQ ATIGCTAGAA 420 
AOCACTTTTC TTTCTTOTOT TGTCXTTTTA TGIGGAAACT QCIAGACAAC TGTIGAAACC 480 
TCAAATTCAT TTCCATTTCa ATAACTAACI GGAAATC 



Seq ID NO: 504 Protein sequence
Protein Accession #: NP_002398

1 11 21 31 41 51 

| | | | | |

KKI^lKVIiHEkA AIiLUICXADS GCXLI£DHVE KnHSDlSIP EYKEIiUQEFI DSDAAAEftMG 60
SFHQCEUIQS HRTLIOIFGLH I«TVYD9IHC NMKSN 



Seq ID NO: 505 DNA sequence

Nucleic Acid Accession #: NM_014791.1

Coding sequence: 171..2126



1 11 21 31 41 51

| | | | | |

TTGGCGGGOQ GAAOCGQCCa CAACCCGGOG AT0GAAAa<3R TTCTTAGGRA CGCCGTACCA 60 

aCCQO&SCiC TGAGGACA6C AQ3CCCCTGir CCXTCTGT09 GOOGCOGCIC AGCOSTGGCC 120 

TCOGCCOCTC AGGTTCTTTT TCTAATTCCA AATAAACTTQ CAAGAOGACT AT GaAftS ATT 180

AIGATGAACT TCTCAAATAT TATOZ^ATTAC AIGIUUUITAT TGOGftCASSlr GGCTTTGCAA 240 

AGGTCAAACr TGCCTGCXSCT ATOCITACTG GASAQATCGT AGCTATAAAA ATCATQGATA 300 

AAAAfZACACr AGGGAGTGAT TTSCCCCiGGA TCMUVAOGGA 6ATTGAGGCC TTGAA6AAGC 360 

aX3AGAlCATCA GCATATATiQT Gftl^CTCTAGC ATGTGCTA0A GACAGGCAAG AAAATATTCA 420 

TGGTTCITQA GTAXMCXOT GeWSOAOftGC TOTTTGACIA TATAATTTOC CAGGATCSOC 480 

T6TCAGAM3A OGAQAOCXZOa GirGTCTTGC 6ICAGATW3T ATCKSCrGXT GCTTATGTQC 540 

ACAGCCAQOG CItAlGCTCAC AGGGACCTCA AGCCAOAAAA TTTGCTOTTT OATGAATATC 600 

ATAAATTAAA GCTGATTGAC TTTQGTCTCT GTGCAAAACC CAAGGGTAAC AAGGATTACC 660 

ATCTACAQAC A!E6CTGIGGG AGTCTGGCTT ATOCftGCACX: TGAGTTAATA CAAGGCftAAT 720 

GASKTCTTGO A1CAGAGGCA GATGTTIGGA GCATGGGCAT ACTGnrHAXAX GTTCTTATGT 780 

GTOGRTTTCT ACCATTTGAT GATGATAATG TAATGGCTfT ATACMGAAG AXXAXGAGAG 840 

GAAAATATGA TGTTCCCAAQ TQGCTCTCTC OCRGTAGCAT TCTGCTTCTT CAACAAATQC 900 

TGCAGGTGGA CCCAAAGAAA OGGATTTCTA TGAAAAATCT ATTGAACCAT GCCTGGATCA 960 

TGCAAraiTrA CAACTATCCT GTTQAGtGGC AAAQCAAOAA TCCTTTTATT CACCIGOATG 1020 

ATGSKTTGOGr AACAGAACIT TCTGIACATC ACAGAAACTA CAOaCAAABL AXGOAGOATT 1080 

TAATTTChCr OTGOCAGTAT GATCACCICA CaaCTAGCrA TCXTCIGCIT CTAGCCAAGA 1140 

AGGCIGGG6G AAAAOCAGTT CGTTTAAOGC! 'ITTCTTCTTT CTTCSCTOXGGA C31AGCCAGTQ 1200 

CTACXrcaXT CACMSACATC AAOTCAAASCA ATTGGAGTCT GGAAGATGTG ACCGCSWGXG 1260 



AIAAAAKTTIV TGTGGOQGGA TTA&TAGACT ATGflTTCWTG TORAfiftTGAT TTATCAACAG 1320
OTGCTGCTAC TCCCCXSAACA TCRCAGTTTA OCAAOTACTG GACRQAATCA AATGGGGTGG 
AATCTAAATC ATTAACTCCA QCCTTATQCA GAACACCTGC AAATAAATTA AAgAACAAAg 1440 
AAAATGTATA TACTCCTAAG rCTGCTGTAA AGRATGAAGA GTACTTTATG TTTCCTGAGC 1500 
CAAA6ACICC AGTTAATAAG AACCAGCATA AGAGAGAAAT ACTCACTAOff CCRAATCXSTT 1560 
ACACTACACC CICAAAAGCT ASAAACChGT 6CCTQAAAGA AACTCCAATT AAAATACC2\G 1620 
TAAA-TTC3^AC A6GAACA(3AC AAGTTAATQA CRGGTGTCRT TJtfSCCCTGAG AiaaCQOTGCC 1680 
GCTCAGTGGA ATTGGATCTC AACCAAGCAC ATATGGAGQA GACTCCAAAA AGAAAfiOGAG 1740 
CCAAAGrGTT TGGGAGCCTT GAAAGSGGQT TOGATAAGGT TATCACTGTG CTCACCA<3GA 1800
GCAAAAGGAA GGGTTCTGCC AGAGAOGGGC OCAGAAOACT AAAGCTTCAC TATAATQIXSA 1860
CIACAACTAS ATTAjSTGAAT CCAGATCAAC TBTTGAATSA AATAATGTCT ATXCTTOCAA 1920 
AGAAGCATGT TGACTTTGTA CAAAAGGGTT ATACACTQAA QTGTCAAACR CAGTCAGATT 1980 
TTGGGAAAGT OAGAATGCAA TTTGAATTAO AAOTGTGCCA GCTTCAAAAA CCCGATGTGG 2040 
TQGGXATCAG GAGGCAGOGG CTTAAGGGCG ATGCCTGGGT TTACAAAA£3A TTAGTGGAAG 2100 
ACATCCTATC TAQCTGCaUMS 6TATAATTGA TOQATICTTC CATCCTGCCH QAT GAGT GTG 2160 
GGTtTEGATAC AGGCTACATA AAORCTOrTA TGATOGCTTT GATTTIAAIUS TTCATTOGAA 2220 
CTACCAACTT GTTTCIAAAG AGCTATCTTA AiOACCAAVAT CTCTTTGTTT TTAAACAAAA 2280 
QATATTATTT TX3TGTATCAA TCTAAATCAA GCCCATCXGT CATTATGTTA CTGTCTTTTT 2340 
TARTCATGTG GTTTTGTATA TTAATAATTG TTQACTTTCT TAGATTCACT TCCATATGTG 2400 
AATGTAAGCT CTIAACTATG TCTCTTTQTA AOIGTGTAATT TCTTTCTGAA AIAAAACCAT 2460
TTGTGAATAT 

Seq ID NO: 506 Protein sequence
Protein Accession #: NP_055606.1

1 11 21 31 41 51

| | | | | |

MTmvngT.T.gy YEUtreTIGTG GFAKVlOiACH ILTGEMVAIK IMDKHTLGSD UPRIKTBlEA 60 
IiKNIiRHQillC: Qli^MVLETAN KIFMVLEYCP GGELFDYIIS QDRIiSEEETR WPHQIVSAV 120 
AYVH&QGYAH RDLKPSHIiIiF DEYHKLKLID FGI>CAKP1QQN KDYBXjQTCGQ SIAYAAPEIil 180 
QGKSyXiGSSA DVWSMGIIiLY VLMGGFIiPFD DDNVHAZrYKK XHBOKXDVPK WLSPSBIIiUI* 240 
QQMIiOVDE>^ RISKK»IUIH FWIMQiDYllYE VEWQ9KHPFI HLDDDCVTEL SVHHRNNRaT 300 
MEDLISXiWOY DHDTATYLliL LAKBCARGKPV RLRLSSFSOS QASATPFTDI KSMUWeiiB3?V 360 
TAfiDKHYVAG LIDYDWCEDD LSTQAATPRT SQFTKyWTBa N6V£SICfiLTP ALCRTPAHKli 420 
KHKEIIVYTPIC SAVKHEEYPM FPEFKTFVKK NQHKRBILTT SNRYTTPSKA RI2QCLKETPI 480 
KIFVSI&IXSTD KLMTGVIQPB RRC3RSV£LDIk MQAHHBBTPK RKSAKVFGSL EROU3KVXTV 540
LTRSKBKS3SA RDGFRRI.KLH YKVTTTRIiVH PDQLtiHSIMS ILPKKHVDFV QSQSYXLKCQT 600
OSDPGKVTMQ PELEVOQIKJK PDWGISfiQE LRGDAWVYKR tiVEDlLSSCK V 

Seq ID NO: 507 DNA sequence
Nucleic Acid Accession #: NM_000582
Coding sequence: 88..990

1 11 21 31 41 51 

| | | | | |

GCAGAGCACA GCATGGTCOS GACCAGACTC OTCTGASSOC AGTTOCaGCC TXCICA6GCA 60

AACGCCGACC AAGGAAAACT CACTACCATG AGAATTGCAO TGATTTGCTT TTQCCTCCEA 120 

GQCATCACCT QTGCCATAOC AGTTAAACM QCJIGATTCTG GAAOTTCTGA GC3AAAAGCAG 180

CTTTACAACA AATACCCAQA TaCTGTGGOC ACATGGCTAA ACCCTGAOCC ATCTCAGAAS 240 

CS^SAATCTCC TASCCOCACA GACC3CITCCA AG12U«3TOCA ACSQAAftQCCS^ TOACC ACATG 300 

GATGATATG6 ATGATGAAQA TQATGAXGAC CATOTOQAOk GGCAGGACTC CATTaACrCS 360 

AACQACTCTG ArTGATGTAGA TGACACIOAT GATFCTCAOC: AGTCTOATGA GTCTCAC CAT 420 

TCTGAHGAAT CTGATORACT GGTCACTGAT TTTCOCACX3G ACCTGCCASC AAC3CSAAGTT 480 

TTCACTCCAG TTGTCOCCAC AGTAGRCACA TATGATGGCC GAOTTGATAG TGTGCTTTAT 540 

GSUITGAGGT CAAAATCIAA GAAGTIIOGC AiGACSCTGACA TOCA6TACOC TGATGCTACA 600 

GACGAOGACA TCAOCTCACA CATOGAAAlGC GAGGAjGTTQA ATGGTGCAXA CAAGGCCATC 660 

CGOGTIGCCC AGGiACCTGAA GGCGGCTTCT GATTGGGAiCA GCGGTGCaiSUl GGAjCAGTTAT 720 

GAAACGAGTC AGCTGGATGA CCACaftOTGCT GAAACOCACA GCCACAAGCA GTOCAGATEA 780

TATAAGCGGA AAQCCAATGA TGAGAGCAAT aAQCATTCX3G ATGTGATTGA TAQTCAGGAA 840 

CTTIOZAAAG TCAGOCGTGft. ATTCCAC3iGC CATGAATTTC ACaGCCATGA AGATATGCTQ 900 

GTrOTAQACC CCAAAHSlAA. GGAAOAASAT AAACftCCIGA AKSTTCOTKr TTCICRIGAA 960 

TTA&ATAGTG CATCTTCTGA GGTCAATTAA AAGGAGAAAA AAITACAAXTT CTCftCTTTGC 1020 

ATTTAGTCAA AftGA3U\AAAT GCTTTATAGC AAAATGAAAG AGAACATGAA ATGCTTCTTT 1080 

CTCRGmAT TGGTTGAAT6 TGTATCTATT TGAGTCTOGA AATAACIAAT QTOTTTGArA 1140 

ATTAGTTTAG TT1X3TGGCTT CATOQAAACT OCCTGTAAAC TAAAAGCTTC A80GTTATGT 1200 

CTA TO TTCRT TCXMTAGAAe AAATGCAAAC TATCACTGIA TTTTAATATT TGTTAIXCXC 1260 

TCAT6AATAG AAATT7ATGT AGAAGCAAAC AAAATACTTT TAOCXACTTA AAAAGAGAAT 1320 

ATAACATTTT ATGTCACrAT AATCTTTTGT TTTTTAAOTT AGTGTATATT TTQTTGTGAT 1380 

TATCTTTTTG K3GTGTG7AT AAATCTTTTA TCTTCAATGT AATAAGAATT TGSTGSTGTC 1440 

AATTGCTTAT TTOTTTTCCC ACGQTT6TOC AGCRA1TAAT AAAACATAAC CTTTXTTACT 1500 
GGCTAAAAAA A2»AAAAAAA AAAA 

Seq ID NO: 508 Protein sequence
Protein Accession #: NP_000573

1 11 21 31 41 51

| | | | | |

HRIAVICPa. LQlTCAIPVK QADSGSSHEK OLYMBTEDAV ATHLNPDPSQ KQNIiIAPQTI* 60
PSK5HESHDR ^/IDISMDDEDDD IJUVDGQDBXP SHIDSDDVDDT DDSHQSDE3H H5DG8DELVT 120 
DPPTDLPATE VFTFWPTVD TYDGKGDSW YGEiRSKfiKKF RRBDIQYPBA TDEDITSHHS 180 
eSEUOCSAYKA IFVAlQpLHAP 8DWD8RGKDS YETSQIflDQS ABTHSElQQaiL ItYlOUCAHDES 240
KEHSDVIDSQ BL8KVSREPH SHEFHSEEDK LVVDPK6KBB DKHLKFKIGR BLDSASSBVH 



Seq ID NO: 509 DNA sequence
Nucleic Acid Accession #: AB051390
Coding sequence: 34..2457



1 11 21 31 41 51

| | | | | |

JUSCGQCOSCG GCACAAAGTT GGGGGCXXKjG AAGATGAGGC IGTCCXlCGGC GCCCCTGAAG 60 
CTOAGOCGGA CTCCQQCACT GCTGGCCCTG GOSCTGCGCC TGGCOGCGGC GCTOQCCXTC 120 
TOaSACQftGA CCCTGGACAA AGTGCCCMG TChGftSGGCT ACTGCAtSCCG TATCCTGOSC 180
GCXAGGGCA CGCQGCGCGA G(3C3CTACACC CSAGTTCAfiCC TCX^GCGTGGA GGGCOACCCC 240
GACTTCTACA AGCCQGGAAC CftGCTACCGC GIAACRCTTT CAGCTGCrrcc TCCCTCCTAC 300 
TTCAGAGGAT TCACATTAAT TGCCCTCACA GAGAACAOAB AdGGGTGATAA GGAA6AAGAC 360 
CATaCTOCSGA CCTTCC34I&AT CATAGAOGAA GAAGAAACTC AGTTTATQAS CAATTGCCCT 420
GTTGCAGTCA CTGAAAGCAC TOCACGGAGG AGGACCOOQA TCCAGGtGTX TTGGATAGCA 480
CCACCaOCQQ SAACAGGCTG CGTGATTCTG M.GGCCRGCA TOGTACRAAA ACGCRlTATT 540

TATTTTCAAG ATGAGGGCTC TCTGACCAAG AAACTTTGTQ AACflAfiATTC CACATTTGAT 600 
GOaGTGACTQ ACAAACCCAT CTTAGACTGC TGTGCClGCG GAACTGCCAA GTACAGACTC 660 
ACKmrXTG GGAATTGGTC GQAQAAGACA CACCCAAAGQ ATTACCCrGG TGGGGCCAAC 720 
CACIGGTCTG COATCATCdB AGGATOCCAC TCC&AGAATT ATGTACTOTa GQAATATOGA 780
GGATATOCCA GCGAAGGCC3T CTiAACAAGTT GCAGAATTGG GCTCACXX3GT GAAAATGGAG 840
GAAGAAATTC QACRACAGAG TOATGAGGTC! CTCACOSTCA TCAAAQCCAA AGCXKAATSG 900 
CCAGCCXGGC AGCCTCTCAA CGTGAGAGCA GCACCTTCAG CTQAATTTTC CGTGGACAGA 960 
AC60GCX&TT TAATOTCXrrT CCTQACCI^TO ATGGSCCCXa^ GTOOOGACTG GAACGTAGGC 1020 
TTATCTGCAG AAGATCTGTG CACCAAGGAA TGTGGCTGGO TCCAGAAGGT GGTGCAftGAC 1080
CTQATTCCCT GGGACGCTaa CAGCGAt^AGC G6GGTGACCT ATGAGTC31CX: CTIACAAACCC 1140
AGCATTCOCC AG6AGAAAAT CCGGCCCCTG ACCAGCCTOG ACCMCCTCA GA6TCCTTTC 1200
TATGACCCAG AGG6TGGGTC CATCACICAA GTAGOCAGAG TT6TCATCQA GA6AATCGCA 1260 
GGGMVGGGTG AACAATGCAA TATTGTAOCT QACAA.TGTCG ATGAXATTGT AGCTOACCTG 1320
GCrCCAOAAG AGAAAGATGA AGAT6ACACC CCTGAAACCT GCATCTACXC CAACTGGTCC 1380 
CCATGC3TC0G CCTGCASCTC CTCCMK^TGT QACAAAC3GCA AGAOGATGOG ACanCSHCATG 1440

CTGAAAfiCAC AGCTGGACCT CA6C6TCCCC TGCCCTGACA CCCAGQACTT CXaOCOCTGC 1500 
ATGGGOCCTG GCTGCAGTGA CGAAQACGGC TCCACCTGO^ CCATGTGGGA GTGGATCACC 1560 
TtaQICQCCCT 13CAGCATCTC CIGOGGCATG GGCATGAG6T CCCGG(?IVB2UB GTAT61N3AAG 1620 
CT^TCOGOG AOQACGGCTC OGTQTGCACG CTQCCCACTG AGGAAAGGGA GAAlGTaCACXS 1680 
GTCAACGAGG AGTGCTCTOC CAGCAGCTGC CTGATGACOG AGTGGGGCQA GTGGGAGGAG 1740

TGCAGCGOCA CCTGCGGCAT GGGCATQARG AAGCQGCACC GCATGATCAA GATQAACCazc 1800
GCAC3ATG(3Cn: CCATGTGCAA AGCGGAGACA TCACAGGCAG AGAAQTGCAT GATGCCAGAG 1860 
TGGCACACCA TCGCATGCTT GCTCmXXX:A T8QTC08AGT GGAGTGACTG CAG0GT3ACC 1920 
TOaStSBMOXS GCATQOBAAC CC3GACAGCX56 A1GCTC3UU5T CTCTGGCAia ACrCGGHUSAC 1980 
TGCAATQRGG ATCTGGAGCA QGTGGAGAAa TGCATGCTCC CTGRATGCOC CATTGACTGT 2040
GAGCTOACOG AGTGOTOCCA GflQGTCGGRA TGTAACAAGT CATGTGGGAA AGGCCACGITG 2100 
ATTC3GAACCC CGATGATCCA AATOGAGCCT CAGTTTGOAG OTGCftCGCTG CCCAGAGACT 2160 
GTGCAGOQAA AAAAGTGCCX3 CATCOGAAAA 1X3CCTT06AA ATGCATO^T CCAAAAlGCrrA 2220 
CX3G?:GGM3GG AGGCOCGftGA GA6CCX3G066 AGTGAGCAlGC: TQAAiGGAAGA GTCTGAAGGG 2280 
GAGCAGTTOC CAGGTTGTAG GATGCQCCCA TGGACGGCCT GGTCAGAATG CACCAAACTG 2340
TQCiaSRGGTG GAATTCAGGA AOGTCACATG ACTGTAAASA ACAGATTCAA AAGCTCCCAG 2400 
TTTACCAGCT GCAAAGACAA OAAGGAQATC AGAQCATGCA JVrGIICATCC TTGTTAGCAA 2460 
GGGIAG6IU3I TCCCCAGGGC TGCftCrCTAG ATTCCAGAGT CACCAAnCIGC tEGQATTATTT 2520 
GCTTOTTTAA GACAATTTAA ATTGTO3ACS CTAQTTTTCA TTTTTGCM3T GTGQTTOOCC 2580
CAGXAGTCTT GTGGATGCXZA GAGRCATCCT TTCTGAATAC TTCTTQATGG GTACAGGCTG 2640
AGTGOGGOGC CCTCACCTCC AQCCAiOCCTC TTCCTrGCAGA GGAGTAGTGT CAOCCACCTT 2700 
GIACTAAGCT GAAACATGTC OCTCTGGAGC TTCCaCCrGG CCAGGGAI3GA OOGAGACTTT 2760 
GACCTACTCC ACATGGASftB aCAACOLTGT Cr&SMHSSGK CTATGCCTQA OTCQQUSSGT 2820 
GGSGCftSGTA GGAAACATTC ACAGATGAA6 ACAGC3U3ATT C3CCCACATTC TGATCTTTGQ 2880 
OCTOTTCRAT QAAACCATTG TTTQCCCATC TCrTCTTAGT QGAACTTTAG GTCTCTTTTC 2940
AAGTCTOCTC AGTCATCAAT A6TT0CT0GG GAAAAACAOA GCrGGTAGAC TTGAAGAGQA 3000
aCATTGATOT TGGGTGaCTT TTOTTCTTTC ACTGAGAAAT TOGGAATACA TTTOTCTCAC 3060 
CCCTGATAIT GGXTGCTGAT GOGCCCGCAA CAAAAATAAA TAAAZAAAXT jOGGCXGCTT 3120
TATTTAAniA TAAGGTAGCT 2USTITTTACA CCTGftGAIAA ATAATAAGCT TAQAGTSTArT 3180

TTTTCXX^^TG CTTTTGGGQG TTCAGACGAG TATOTAC31AT TCXTCTGGGA AGCCAGCCTT 3240
CTGAACIITT XGGTACTAAA TCCTTATTGG AACCAAGACA AAGaAAG<::AA AATI&GTCTC 3300 
TTTAGA6ACC AATTTGOCTA AATTTTAAAA TdTCCTACA CACATCTAGA CSTTCAAGTT 3360 
TOCAAATCAG nXTTAGCAA GAMUbCATTT TTGCIATAC^ AACATTTTGC TAAGTCTGCC 3420 
CftAAGOCCOC CCAATQCATT CCTTCAACAA AATACAATCT CICmCTTTA AAGTTATTTT 3480 
AGTCATGAAA TTTTATATGC AG2VGAGAAAA AGTTAOOGAG ACAISAAAACS^ J^CTAAGOG 3540
AAAGGAATAT TATGGGATTA AGCTGAGCAA GC3VATTCTGG TG(sAAAGTCA. AAGCTGTCAG 3600
IGCTCCaCAC CAGGGCIGTG GTOCTCCCAG ACATGCATAO GAATGaCCRC AGGTTTACAC 3660
TGCCITCCCA OCAATTATAA QCACSUXIAGA TTCAOOGnGA CIGAOCACCA AOOCnTAerG 3720 
TAAAAGGACA TTTTCTCAGT TGGGTGCATC ASCAGTTTTT CTTCCTBCAT XTATTGTTGA 3780 
AAACTATTGT TTCATTTCTT CTTTTATAfla OCTTATTACT OCTIAATCCA AATGTGTACC 3840
ATTGGTeaQA C&CATACAAT GCTCTQAATA CRCTACGAAT TTGrATTAAA CACRTCAGAA 3900 
TAT7TOCAAA TACAACATAG TATAGTCCTG AATATGTACT TTTAACACAA GAGAGACTAT 3960 
TCAATAAAAA CTCACTGGGT CTITCATGIC TTTAAGCZAA GIAAGrGTTC ASAAGGTTCT 4020 
TTTTTATATT GTOCTCCSVCC TCCATCATTT TCAATAAAAG ATAGOGCTTT TGCTCCJCTTG 4080 
TTCriGSAGG GACXZATTATT ACATCTCTGA AJCTAjCCTTTG TATOCAACAT GTTTTAAATC 4140

CTTAAATGAA TTGCTTTCTC CCAAT^AAAAG CACAATATAA AC3AAACACAA GATTTAATTA 4200 
f tllTCTACr TGOOGGQAAA AAAGTCClTCA TGTAGAAQCA CCCACTTTTQ CAATBTTGTT 4260 
CEAAGCIATC TATCTAAGTC TCAGCCX9VTG ATAAAGTTCC TTAAfiCIGGT GATTOCTAAT 4320
CAAGGACT^AG CCAOCCTAlQir GTCTCATGTT TGTATTTGGT OCCAOTTGGG TACATTZTAA 4380 
AATOCTGArr TTGGAGACTT AAAAOCAGGT TAATGGCTAA GAATGQ6TAA CATGACTCTT 4440
GTTaOATTQT TATTTTTTGT TTGCAATGGG GAhTTTATAA GAAGCA.TCAA GrCTCTTTCT 4500 
TACCAAACnXr TTGTTAGGTG GTTTArrAGTT CTTTTGGCTA ACAAATCATT TTGGAAATAA 4560 
ASATTTTTTA CTACAAAAAT O 
Seq ID NO: 510 Protein sequence
Protein Accession #: BAB18461.1

1 11 21 31 41 51

| | | | | |

MRLSPAPIiZO. SRTPALUUiK U lAAKLAFS DEITLDKVPKS BGyCSRlLRA OGTRREOyTE 60 

FSZiRVEGDPP FYKPGTSYKV TliSAAPPfiYP RGFTLIALRIS NaHGDKBBDH AGXFOXIDSB 120 

ETQFMSNCPV AVTESTEHEH TOIQWFHIAP PAGTGCVJLK ASIVQKRIIY PQDEGSI,TKK 180

liCEQDSTEDG VTDKPILDCC ACGTAKYRLT FYGNWSEKTH PKDYPKRANH HSAIIGGSHS 240 

KNYVIi^YGQ YASEC?VKaVA ELGSPVKMSB EIRQQSDEVI. TVIKAKAQWP AWQPLNVRAA 300 

PSAEFSVDRT aHLMSFLTMM GPSPDWMVGl* SAEDICTKEC GHVQKWQDIi IPHDAGTDSG 360 

VTYBSPNKPT IPQEKIRFKT SlitfflPQSPPY DPEGGSITQV ARWIERIAR KGEQCNIVPD 420 

KVDDIVADLA. PEEKDEDDTP ETCiySNWSP W9ACSSSTCD KGKRMRORML KAQLDIiSVPC 480 

PDTQDPQPCM GPQCSDEDQS TCXMSEW2TW SPCSISCX3HG MRSSSRYVICQ FPBDGSVCTr. 540 

PTEETEKCTV WEE5CSPSSCL MTEWG^TOBC SATC3CTJGMKK HHRMIKMBJPA POSMCKAEIG 600 

QAEKOOtPEC HTIPCIiI»SPW SEWSDCSVTC GKGMRTRQFK IiKaiiAELGDC HEDLEQVEKC 660 

MLPfiCPXDCB LTBW8QWSEC NK8CGKGHVI RTRMZQMEPQ FOGAPCPBTV QRKKCRXRKC 720 

liSKPgiQKZA HREARBSSRS BQUCEEfiSGB QFPGGRMRPn TANSSCIKLC GGGIQEKYMT 780 
ViOCBFKSaQF TSCKDKXBIR ACHVHPC 

Seq ID NO: 511 DNA sequence

Nucleic Acid Accession #: NM_003108.1

Coding sequence: 76..1401



1 11 21 31 41 51 

| | | | | |

GGGOTGGGAG GGGGAGGOGG ACCTCOGCAC GAGACCCAQC GGCCOGGGTT GGAOOGTCCA 60 

GCCCTGCAAC GGATCATQOT QOWSCaiQGCG GAGAGCTTGG AAGOGGAPRQ CAACCTGCCC 120 

CGGGAGGC6C TGGftCAOGGA GGAiGGGOGAA TTCRTGOCTT GCAGCCOGGT GOCCCTGOAC 180 

GAGAGCGACC CABACTQQTG CAAGACGGCG TOSGGCCACA TCAAI3CC3GCC G&OXSAACGOG 240 

TTCATOOTAT OGTCChAGAT GGAAOGCAGG AAOATCATOQ AaCAGTCTCC GGACATGCAC 300 

AKOBCCGUGJi TCTCCAAGAB QCTOaSCAAfi GGCTCGGAAAA TGCTGAAC3GA CAGCGAGAAG 360 

ATcccerrcA tcgoogaggc gg»gcggctg cogctcaagc acatggcgga ctaccccqac 420 

TACAAGTACC GGCOCCGSAA AAAQOCC^AA ATGGACCCCT OGGCCAAOCC CIUSCGCCAGC 480 

CAQAOOCCflG AGAAGAGOGG GGCXX3GCGGC GC3CGGCQQGA GCG0GGGC9GG AOGCGOOGOC 540 

GGTGCCAAGA CCTCCAAjiGQ CTCCAQCAAG AAATGCGGCA AGCTCAAGGC CX!CCX3CGGCC 600 

aCQGQOQCCft AGGC3GGGOSC GGGCAAGGCC GCCCAOTCCa GGGACTCAGGG OGGCGCGGGC 660 

GaOGACTAOG TGCTGGGCA3 OCTGCSSOSTG A6O0GCTCGG GC3GG0GGOCTO CX3CGQGCAAG 720 

ACQGTCRAflT GCGTGTTTCT GGATGAGGAC GAC3AC33ACX3 A0GAC3GA0GA CX5AOGR0CTG 780 

CAGCTGCAGA TCAAACAi3GA GC0GGA£X3AS GAGGAOGAGO AACCAOOGCA GCAGCIVGCTC 840 

CrOCZUSCOGC CGGGGC^GO^ 6COGTOGCAG CTGCTGAQAC GCIACAAOGT OGGCAAAgra 900 

CCCGCCAGCC CTACX3CTGAG CAQCTCQSQG GASTCCGGCG AGGGAGCGAG CCTCTACGAC 960 

GAQGTGCaGG CCGGCGGGAC CTCGGGCGCC GGGGGCC3GCA GCCGCCTCTA CTACAGCTTC 1020 

AAGAACATCA CCAAGCAGCA CCOGOOGCCG CTOaaacaGC CQGCGCTGrc GCCCCICX3T0C 1080 

TOGCGCTCOa TGTCCSUZCTC CTCGTCCAGC AGCAGOGGCA GCRGCAaCQG CAGCAGGGGC 1140 

GAiSGAOGGOG AOGACCTGAT GTTOGACCTQ A£aCTT6AATT TCrTCTCAAAG COC0C31C3UC3C 1200 

oocRooaaGe Aac3uaCT6caG osgoggcscg gcggocggga aoctgtccct gtcgctggtg laso 

GATAAGGATT TG6ATT0GTT CAGOGaSOac AQOCTGOGCT CCCACTTCGA GTTCCCCGAC 1320 

TACTGCACGC GGOAGCXGAa CX3AGAIGATC GOGGGGGACT 6GCTGGAGGG GAACITCTCC 1380 

GAGCIGGTGT TCACATATTG AAAGQCGCCC GCTGCTCGCT CTTTCTCTOG QAGGfiZGCAG 1440 

AGCTQGC3TTC CTTGGGAGGA AGTIGTAGTG GTGATGATGA TQATGATt3kT AATGK7GAT9 150O 

ATGATGGTGG TOTTGATGGT QGCOarGGTA GGGTGGAQC3G GAGAGAAGAA OATflCTGATG 1560 

ATATIX^TAA GAT6TCGTGA OGCAAAGAAA TTXSQhAAACA TGA'TGAAAAT TTTGGTGGAQ 1620 

TTAA3«5XQAA ATGAQTAQnT TTTAftAChXr rCTTOCTGTCC TTTTTTTSTC GCCCCTGGCT 1680 

TCCTTTATOG TGTCTCAMSG TAGTTGCATA CCTAfiTCTGG 7USITGTGA.TT ATTTTCX3CAA 1740 

AAAATGTGTT TTTOTAATTA CTATTTCTTT TTCCTGAAAT TCGTGATTGC AAJCAAAGGC2^ 1800 

GAGQGGGOQS CGCGGOGGRG GGGAGOTAGG ACOCGCTCCQ taAAGGOQCTG TT1X5RAGCTT 1B60 

GTOSGTCXTT GAAQTCrGOA AGAOGTCTGC AGAGGAOOCT TTTGCJC3V9Ca CAACOXITTAC 1920 

TCXAGOGAGX TGGTGGAiGAT ATTTTTTTTT CITAAGAGAA CCTAAAGAAC TGGTQATTTT 1980 
TTTTIAACAA AAAAAGGS 

Seq ID NO: 512 Protein sequence
Protein Accession #: NP_003099.1

1 11 21 31 41 51

| | | | | |

KVQQ|ffiSI.BIV ESNI.PREAIJ3 TEBGBFMACe PVALDBSDFD WCKTA8(3HIK RPMKAItHVWS 60 

KIERRKIHSQ SPDMSNAEIB KRLGKSKICML KD8EKIPFIR BAERIiBZtKHM ADYPDYK3fRP 120 

RKKPKMDP3A KPSASQSPEK SAAGGGGG3A GOaAfiOAJCIS KGSBKKGGKIi KAFAAAGAKA 180 

GAGfOiAQSa) ^CGGAGDDYVL GSLRV&GGG6 GQAiGlCrVKCV FUSBEDDDDD nDDELmiK 240 

QEPDBSDBBP PHQQPU^PPG QQPSQZiXtRRY BVAKVRASPS LG&&AESPBG ASIa^EVRafi 300 

AISGAGGGSR LYYSFKNITR QHPPPLAQPA LSPAS8RSVS TSSSSSSG95 SGSSGEDADD 360 

U4PDL5LSFS QSAHSASEQQ liGOGMUVBHIi SSiSIiVDKDU} SFGEGSLGSE PBPPDYCTPE 420 
I.SEHIAGDHI1 BANFSDLVST Y 

Seq ID NO: 513 DNA sequence

Nucleic Acid Accession #: CAT cluster



1 11 21 31 41 51

| | | | | |

GGTGGAlCCrA AKTCTGMIAA CCQBCmm MffCMOTTH. TTOSTGTTAT TATAGlTAGAG 60 
ATTGOTAATC TACAGTAAtSA TmO^OTTA GGKTTTGAOA TTATGATAAT AACTAATAOA 120 
ATATTTCTAA ATTBSAATTA GAAGATTGTT GZKIGACAGA GAGTCAGOAC TXGCCATTTG 180 
CAAAClvrOW AAGTCATTGT TTGGTOTOTA ATROTACMA ATCATCTTIaC TlAACAGAGA 240

AAGC3ATATCT GTTGCTCCCG AATeAA&CAA TTTTTCTCAA ATftOWSOSCC CAQAATTGGT 300 

CTCTGACZ^T TAATAAAOAC ATCAAAGATA QCAAAATGAT TTTTATATCT TAGGGCCAAT 36D 

ACTACCAATT TAATAATTAA AACAAGTTCT GGTGAGCTCT GAACTTGGCA GAATTQQTGG 420 

CAACATAGAC TTTC3GATTTT CCAAATTCCC CACaVTAAAAC lU^ASOGGATC AACTAGATAG 480 
AAAAACCA6A AAOCTTEGGA AATATCTGTT TAAAAAAAAA AAAAAGTCGA CGOQCSGCC 

Seq ID NO: 514 DNA sequence

Nucleic Acid Accession #: CAT cluster



1 11 21 31 41 51

| | | | | |

GOAOCCACAG TGAAAGfXCAA GAATGTCAGT GATTOCACAT TTAATATCTA CATTTTTGCA 60 

OGGCASTTAC TCTTTTGrAlQ TATAACATTG AGCTGAIA6C ACATA3TOTA GAC3UVST GAA 120 

TACAI3I3ATTC TCTOSGTTCT ATTOCCAGAA GTCrOSAGQT CATXraOATA STTBIGOGCC ISO 

CTTG6CTTCA CTCTGACTTG TGTGACACAT AAAAATT6TG ATGAAATOTC CZRXAGATGT 240 

CCTGCAOaTC TTAAAAGARC CTTTCCAAAC TATGAAACAG CCCAGCAGCA CTQAGTTAQA 300 

OGTAARTTCT GAACOCTTOA ACACTAAAAC TATTCTRACT QCACATAQAA tTIGGCAAGTA 3 GO 

GCATTCTATa TCTATQRACA GTATGTCTTT TCTATATAAC AOAGAAAATC TTTTTAAOCA 420 

AACTACTC&S TTTAAAAOCT AATTCTTCTC ATAKCCCCNS TACTTTTOAA TGAltfSACAS^A 4fiO 
TCAATOCAAC AGTACACCCT TATTCAGGCA TTTGAAAQAA AGAATTGGAfS ATCTAGTTTG 540 
TATCI^ATAT TATAAATTAG TATGGTTTAO TCTTTGTCAT GAAATTCTAC TTAATTTTTG 600 
OACTSWTAGGT TirAAGAATGT AAGCAGAAGT TCTCSCACCAA TCAGAATAAG CTACATTATG 660 
CTT6A6T6AC AACTACTaXA AX(3AC!AAAArr ATC2USTGGCT TAATACAATG 6TTTTTCTCT 720 
CSVTACTTGTT CATAAA6AGT CASCAAGQAC CCIGCTCATX ATG6TCCCTC JU?I3GACCC3US 7B0 
GGTTGTTGGA AQCTCCACCSV VTTTAGA.TA6 CTCCCTTCAA AGTCAGCCAT GTTTGGCAGT 840 
CCATGTOCXIC CAACAGGCTG OCAAAATTTQ aCTCTGGATG GCTTCAAtaOA TTGAGCATC6 900 
GGCAGTTTRA ATGCTTTCAA CATQGAAAGT GtSACAOCSQaC CACTCCSCACT CACATCCCTT 960 

GGGCCAGAAC TAGGTCACIQ GQCOCOGACC TAACXTOGGA GGOTTGGGGA A1?T8XAATTC 1020 

CTCCATQTAC CCAAAGTGGA GAGAAGCCAG ATACTQAGAA ACATCAATAA TGGCTAACM 10 SO 

AAATCCATTC TACCATTCCX: TTTOCCTAAA GnGAAAAGAT GAGTACTTTC A1CAATTT6T 1140 

AAACXGTACT TTTOAAtaTAA ATCCTGGTAG CTTGCATGGQ OGCTGGATTT CCRGAAAGOC 1200 

ATATGTAATT TGGGAATGAC ATTCaCTTAA GCTCATAGAA TATCATTATT TGArrGTAAAA 1260 

TGCCCrCATT IXJCAATACAG C2U:CAAAAT6 CACCAAC3C3kC AAAACCCCCC TCC3CCAOGGG 1320 

GCOOSGGGTC OCTATTCCOC TOCATCCCTST TAAAVG2IGGC ATTCIAXOKT TXQCUAOXSGA 1300 

AGCOCAOTXG TASTCCAAAG AATTTTACTT AATTCAAGAA TTATTCTCAC TGAATATOTG 1440 

CCaSTTCTGA AAGGAAXGCA AAQTC34AATT TirGCATCTTC TTTGCTCAAO GGCCmTAGA IS 00 

TGTAACAACA CAOACATGAT A13VAGGCTGA CAATGACATT ATGATTMAA TATGTTAAAC 1560 

AACTTATTAA ATTGTGAATC AACAAAAAAT TATGTXCITT ATTTTATQCT TTTGCATAGT 1620 

CCTGACTC3W: TGCCTACATA CCCCrCTffSfr TCCTCAOTTC TTArCXSCTG& TITCTTACM 1680 

GATGGCCCAA GACAGCIGTA GATGTTTTTA TTTAGOU^AA AAAAAAAAAA AAAAGTOGAC 1740 
GOQGCCGOOA AnTTAGTAG 

Seq ID NO: 515 DNA sequence
Nucleic Acid Accession #: NM_012427
Coding sequence: 43..924

1 11 21 31 41 51

| | | | | |

CTTOTGOTTC CXCTCEACTT GGGGAAATCA G6TGCAG0GG OCKZGGCTAC AGCAAGACOC 60 

OCSCCGGATGT GGGTGCTCIG TGCTCTGATC A£»GCCFTGC TTCTGGGGGT CACAGAGCAT 120 

OTTCTOTCdA ACAATSATGT 1?TCCTGTGAC CaCCOCTCTA ACRCOGTGCC CTCTGQGAGC IflO 

AACCAGGACC TOGGAGCTGG GGCCXSGQGAA GAOGOCGGGT CGGATGACAG CAGCAGCCGC 240 

ATCATCAAXG GAXCCQACTG 03ATATGCAC AGCCAGOOGr GGCAGGOCGC GCTOTTOCTA 300 

AGGCOCAACC AOCTCCACTQ COGOGCaGGTG TTGGIGCKTC CACMSTQOCT GCXGAGGGGC 960 

GC3CChCTI3CA GGAAGAAAQT TTTCAGftGTC 0C5TCTCGQCC ACIACTGOCr GTCACCAGTT 420 

TATGAATCTG GGCAGCAGAT GTTCCAGG6G G^CCAAATOCA TCCXXX»CCC tCGQCrACTGC 460 

QRCCCIGGCC ACTCTAAGGA CCTCATGCTC ATX^UUkCTGA JUIAGAAGAAT TOOTCOCACT 540 

AAnSATGTCA GAC3CCATCAA CGTCCCCTCT CATIGTCCC7 CTGCTGGGAC AAAGTGCITG 600 

GTGTCVGOCT GSGGGACAAC CAAGAGCGCC CAAGTGCACT TOOCTAAOGT CCTCCAGTGC 660 

ITGAATATCA GCGTGCTAAQ mUQJUUlAGG TGOGAGGATG CTTACXXX3AG ACAlGATAGAT 720 

GACACCSVTGT TCTGOGCOGG TGACAAAiQCA GGTAGAGACT CCTOCCAGGG TOATTCIGGG 780 

OGGCCTGTGG TCTGCAATGG CTCOCTGCAG GGACTCGTGT CCTGGGGAGA TTACCCCTOT 840 

GCX3GGGCCCA ACAG&CCQGG TGTCEACACG AA£XnCTGCA AGTTCACCAA GTGGATOCA6 900 

GAAACCATdC AOSOCAACTC GTG2U5TCAXC CCAGOACTCA GCACACC3GGC A TCCOS ACCT MO 

OCSQCSMSGQh CAGCQCTGAC ACTCCTTTCA GACCCTCATT CCTTOC3CAGA GATGT!C@h6A 1020 

ATOTTCavrCT CTCGAGOCCC TGACOOCATC TCTCCTtScaAC TCAGGGTCTG CTTCCGCCau; 1080 

ATTG6GCTGA CCJGTGTCrcr CTAGTTGAAC CCTGGGAACA ATTXCCAAAA CTGrCC3«3GG 1140 

OGGGGGTTGC ^XCTCMOCr GCCTGQGSCA CTTTCATGCT CAAGCTCAlSG 600CATCCCT 1200 

TCTcracnoc tcigaoccaa, atttwstcgc agaaataaac tgagaagtgg aaaaaaaaaa 

Seq ID NO: 516 Protein sequence
Protein Accession #: NP_036559

1 11 21 31 41 51

| | | | | |

MATARPPMMW VLCALITALIj LGVTBHVIiftK NDVSCDHPSW TVPfiCSNQDIi GM5RGEDARS 60 

DDSSSEIING SDCDHHTQPW QAAIiLLIlPJJQ IiYCQAVLVKP QWUbTAAHCR KKVFRVHliGH 120 

VSLSPVIESQ QC3IMPQGVK8I PH9GVGBCGH SaDZJOiIXUil SRIRPTKDVR PINVSSHCPS 180 

AGnCCLVSGV GTTKSBQVHF PKVLQCLHIS VliSQKRCEDA YPRQZDDTKF CSUSDKAGSOS 240 
OQGDSG6PW CMGSLQGLVS HGDYPCARPIT RP6VYTII££1C PTKSaQBriQ AH8 
Seq ID NO: 517 DNA sequence
Nucleic Acid Accession #: NM_001719
Coding sequence: 123..1418
1
1 

GGGOSCAGGG 
CXGCCACCTG 
C6ATGCACGT 



GCTTCATCCA 
CCATTTTOGG 

GCCAGGGCrr 

TGGAAChTGA 
TTTCCAAOAT 
ACATC06GGA 

AGSAGGGCTG 
0GCAChACC7 
AGTXQGOOGG 
rCTTCAfttWC 
GCCAGUICOG 
A!9AACAaCA6 
GAGACCIGG6 

AGACGCTQGT 
AGCTOyiTGC 
ACSUQAAACAT 
TTGQGGCGAA 
CTGCCTTTTQ 
AAACATGAGC 
TCXrTACAAGC 
GCGOGGCCAQ 
TTATGAOOQC 
GGGCACKTTG 
CAATA3VAAOT 



11 
I 

GGGCCCGTCT 
GGGCGGTGOG 
GCGCrCACTO 
GCTOOGCTCC 

ccGGcaaccTC 

CTTGCCCCAC 
GCTQGACCTG 
CTCCTAOOCC 
TAGOCATTTC 
ClUUSGAATTC 
CCCAOAAQGG 
ACQCTTOGAC 
CZAQ&GAATCa 
GCTGGTGTTT 



OCTGATTGGG 
CAOOGAOGTC 
CTCCAAGAOG 
CAGCGAOCAlS 
CTGGCAGGAC 
TGCCTTCCCT 
CCACITCATC 
CATCTOCGTC 

GTTTTTCTGG 
TGAGACCTTC 
AGCATATGGC 
TQTOCAQCKZA 
GTCATIGGCT 
CTAC3CAOCXA 
GTGTCTGTQC 
AAKBAATf? 



21 
I 

GCAGCAAGTG 
QGOXXSGAGC 
CXTAGCTGCXaG 
GCCCIGGCOG 
CGCAOCCAOS 
CGCCOGGGCC 
TACAAOGOCA 
TACATUSGOCG 
CTCACCOACQ 
TICCACCCAC 
QAAGCrGTCA 
AATGAGACGT 
GATCTCTTCC 
GACATCACAG 
CTCTCGOTGQ 
CGGCAC6GGC 
CACTTCCQCA. 
GOCAAGAftCC 
AOQCAfiSCCT 
TGGATCATOG 
CTGnACrCCT 
AACCCGOAAA 
CTCTACTTCG 
GCCTGTG6CI 
ATCCIOCATT 
GCCTCXS^TAT 
TTTTGATCAG 
AAACCTAOCA 
OOQAAGTCIC 
GaCCAGCCAO 
GTkAAGGAAAA 



31 
I 

AOCXSACOGCX: 
CCGGAGCCCG 
CGCX53CACAG 
ACTTCAQOCT 



CGCAGCT{3CA 
TGQCGGTGGA 
TCTTCAGTAC 
CGQACATGGT 
GCTACCAlCCA 
CQQC3U3CCGA 
TCOGGATCRG 
TGCTCmCAG 
GC2UZGAGCAA 



CCCAGAACAA 
GCATOCCQTC 
AGGAAGCOCT 
(31ZAAGAASCA 
OGCCTSA3V9Q 
ACATG2UVOGC 
GOOTOCCCAA 
ATGACAGCTC 
GCCACTAGCT 

cccjCRACrrr 

TTTTTCRGTG 
GOAAAAAAAA 
AGOCATGCAC 
CaaXGGGAGG 
TTG&OCCGGA 



41 
I 

GGGACGGCCG 
GQTAGOGCGT 
CTTCGTGGCG 
GGACAACSAG 
GA1GCM30CSC 
QOGCAAGCAC 
GGAOGGOGGC 
CXAGGGCXXrC 
CATGiAGCTTC 
TCQAQAlSTTC 
AITCOGGATC 
0C3TTTATCRG 
CCGTAOCXITC 
CCACTGGGTG 
TGGGChGAGC 
GCAGCCCTTC 
CAOGGGGAGC 
GCGGATGGCC 
CGAGCIGTAT 



CAOCAACCAC 
GCCCTQCTGT 
CAAOGTCATC 
CCXCOGAGAA 
GCCaOQAACC 
AAAGGTGTGA 
GCAGCATCCA 
2U:AA£GCATA 
ggactcgttt 
AAGGGGG06T 
AJGTTCC3X3XA 



51 
1 

AGAGCOGGOG 
CTCTGGGCAC 
6IGCACTCGA 
GAGATGClCir 
AACTCGGCAC 
GGGCCCQQCG 
CCrCOJGGCCA 
GTCAACCTCS 
CGGTITGAIC 
TACMU3GACT 
GTGCTCCAGG 
TGGQCCTCGG 
QTCAATCGGC 
ATCAACCCCA 
ATGGTGGCXT 
AAACAGCGCA 
AACGTGGCAG 
GTCAGCTTCC 
TACTACTGTG 
QCCATC6TGG 
GOGOCCRCOC 
CTGAAGAAAT 
TTCAGACCCr 
AQCAGACCAA 
GRGTATTAdOG 
ATGAACAAGA 
AAGAAAAATQ 
CCAGAGGTAA 
GGCAAOGQI^r 
ATAAAICTCA 



GO 
120 
180 
240 
300 
360 
420 
480 
540 
«0Q 
£€0 
720 
7B0 
840 
900 
960 
1020 

loao 

1140 
1200 
1260 
1320 
1380 
1440 
1500 
1S60 
1620 

leao 

1740 
180D 
1860 



Seq ID NO: 518 Protein sequence
Protein Accession #: NP_001710

1 11 21 31 

1111 
HSVaSLSAAA FHSFVAIiWAP I,FlJaRSAIiftD FSLD1ISVH&& 
IX^LPHRFRP HLQSKBNSAP MFMIiI>LieNAH AVBBGGGPGQ 
IjQDSBFr/mA DMVMSFVNIjV SHDKHFFHPH YHHREPHFDL 
RISVYGFVLQE HLGRESDXtFL X^SRTLHA^E 
UNliGLQLSVE TliDGQ&IHFK tiAGLIGSHGP QNKQPFHVAF 
QHRSETPKHQ EAUWANVAB NS5SPQBQAC KBEELYVSFR 
GECftFFLKBY MHATHHAIVQ TLVflPIMEBT VPEPCCWTQ 
RHMWSACBC H 

Seq ID NO: 519 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 264..762



41 
1 

FXERRUU3QB 
QGFSYPyKAV 
SKIPBGEA-VT 
BOWLVFDITA 
FKATEVEFSS 
DZjGHODHIIA 
UiAISVXiYFD 



51 
I 

RREHQBSIIjS 
F&TQGPPLAS 
AABFRIYKDV 
TSNHWWNPR 
IRSTGSKORS 
FEGYAMnrCB 



1 
1 

OOCTGCZCCA 
TCATOGCGGG 
TTOCTCAGAC 
TOCCGGTOGC 
AACTTATCAG 
TOAXGGOGGT 
TCnBC!IU3QC3V 
CCAGGGOCAG 
CAGAGAGAGA 
ATAGCTTGGA 
ACACACAAGT 
AQAAOVES^ 
TCTGGTA'XTT 
OAATTCCAAA 
GGAAATGCCA 
CAAAAAOAAT 
CCTGGTCIGT 
AAAACAGGCC 
TAAAGTATGR 
AAATTTTAAA 
TGACITOOCT 
AAAAIAGCAA 



11 
I 

GTCACACC06 
ACTAATTTTC 
TGAGAACTOT 
ACTACTrGAQA 
CAAGGAGCrC 
Q&I&CCCCSC 
CATGGOOCAC 
CCCXGGCCAC 
CATCCCAATG 
TAGCTCCTGC 
OSICXTTTCT 
GGMUUrCACA 
TGTCAACOCT 
TATTTTTAAT 
TTTTTCCCCC 
GTOG&GAAGA 
ACCCAAAAAA 
AATGCCCCOa 
GAACTATGGA 
AAGGTTGAAT 
TTGAGCTCTT 
CChCCACGA 



21 
I 

GAAGCTGacr 
CTTftAAATTT 
TTCCAGTATA 
GACGAOGTOC 
ATCATGCTGA 



31 
I 

OGTCCAOGCA 



41 

1 

CAGCTQMGC 



TACKTCAAGT 

CAGGGTGGTT 
CA6AA0TCAT 
CClGCtTCCT 



AnSAGTGftAC 
CATCATOCS^ 
TCTGATTOCC 
AGTTGGCCTC 
GACOCIGSAG 
GATTAITGTCA 
GCICTGTCIG 
GGGGTCCAGT 
ITAAACAAGG 
AAACTGATAA 
GCIOZTOGIT 
CAAGAAAGGT 
GICCATCAGC 
C3VGCTGTTGT 
TAATTATTGG 



CIGCXGTCAA 
TTTACAGGCA 
CTQCCTGCCA 
AACTAAAAAA 
ATQTCAATCC 
AGGCAGOGGA 
TCTCTATGOA 
CATGGGGCTC 
ATACACAGAG 
CCTCAAAAAC 
TQAGATCAGA 
AGAGATAGTA 
AGAGTTCTAT 
CAATAAAOUL 



CACTGAC3AIC 
CCTGAAAGTG 
GOAOGTXriXGG 
CAOOGGCATC 
AGOaaCGAGX 
AGAGATGAAG 
TaAC!AQCXSAC 
GGCCACAGAG 
TGSACTCOOOG 
A(3AA»83WCAC 
ATATGATCAA 
TTCTTACATT 
ACAAGTCTAT 
(^TCCTGAAGA 
AAAAACAAGS 
TGTTAGGAAG 
GTGAAGTCTC 
TTGGCAATCr 
CTTCTTTAAA 



60 
120 
180 
240 
300 
3fiO 
420 



51 
I 

ATOAiOQAAAC 
TCAACTGACC 
TCCAGCACCC 
OCTGAGOC3CC 
CATGGCTTAG 
AACXACCTGC 
CrCCAGGTTC 
GAGACTCACjn. 
ACAGCCTCAG 
GATGTGGATT 
CTOGAlCTATG 
AAQCCCAOTT 
GTGGCCATGT 
TAATTTGTAG 
GGAGACAGGC 
COCATGGACT 
CTlGGClGOe 
AACITTOUSG 
TCCCCAGGGA 
CATGGTTAAA 
AGITTTAAAT 



60 
120 

lao 

240 
300 
360 
420 
480 
S40 
€00 
660 
720 
760 
840 
dOO 
96D 
1020 
1080 
1140 
1200 
1260 



Seq ID NO: 520 Protein sequence
Protein Accession #: Eos sequence
1 11 21 31 41 51

| | | | | |

MI-TEVMBVWH GLVIAWSIjF LQACFLTAIN YLLSRHPIAHK BEQILKAASL QVPRPSPGHH GO 

HPPAVK&MKB TQT£RDIPM8 DSLYBHDSDT VSD&IjDSSCS SPPACQAT£D VDYTQWFSD 120 
FQEUOSDS?!. pyEWIKKlTD XVHVHPSEtHK PSFWYPVNPA LSfiPABYDOV AM 

Seq ID NO: 521 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 107..328

1 11 21 31 41 51

| | | | | |

CTGCrCTGTC TtSAfiCCAQCXa OAATATQATC RAGTQQCCAT STGAATTCCA AATATTTTTA 60 

ATGGGGTOCA GTTCTCTATC GATTCTTACA TTTAATTTGT AGGGAAATOC CSWTTTTTOCX: 120 

CCrtTAAACAA GGCAOTOQGCSC TCACAAGTCT KXGGTtiSriiCUS GCCAAAAAGA AT6TGGAGAA IBD 

GAAAACTGAT AAATACACA6 AGGTCCTCAA GAOOCATGGA CTCCTGOTCT GTAGCCAAAA 240 

AAGCT<?rTC36 O^rCCTCAAAA ACAAAAACAA 66CTT<3GCTG 6GAAAACAGG CCAATGCCCC 300 

(SOCAAGAAAG QTTGAGATCA GATGTTAGGA AGAACTTTCA OOTAAAiCSTAT GAGAACTATG 360 

BAQTCX3LTCA QCAGAQATAO TAOTGAAQTC TCTCX3CC3U3G C3A2UUVITTTA AAAAC3GTTGA 420 

ATCAGCTGTT 6TAGAGTTCI KTTTOSCAAT CTCSVrGGTTA AATGACITCC CTTTCAGCTC 480 

TTTAATTATT GGCAATAAAC AACTTCTTTA AAAOTTTTAA ATAAAATA6C AACCSUCCAOC 540 
A 

Seq ID NO: 522 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

HFFFFUCQGM 6IiTSI.nRQAK VSXVEKKXDKX TBVLKlHGMt VCTQXSCSFIi KNKEIKAlHIiQK £0 
Q»IAPABKy& IRC 

Seq ID NO: 523 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 211..1895

1 11 21 31 41 51

| | | | | |

OQnstrroMsa caoaaccKAaT cacttcct<x: acottctcqt QCxtstSGOGGG aggwscgqas 60 

GOGGCTTOGG AGGCZUaCCTG CTCICCAGTC CCIATOCAOC CACA06TOTT TTG0GT06GA 120 

<32UQaAA'ITAT CTBATAAAAT TCCZGQ9TTA ATATTTTTAA AAACOQAGAQ TTTT^AAAAA ISO 

TGATTTTTTT GGCTOGAAAA TGACCTTTTT ATGCTTOGAA GGAGTTTGTC AAOCAGCATA 240 

QTGCTTTTTC TTTTCTCTTC TTTTTCTACG ATAAATC3AAA OCATTTCTTC AAOAAAAAGG 300 

CAC7VGGTTCC TTGAACAGCT OGATTCTGAT 6GCACX3VTTA CTATAGAGGA GCAGATTGTC 360 

CTTGTGCTGA AAGGGAAA6T ACAATGTGAA CTCAACATCA CAGCTCAACT CCAf^QAQQGA 420 

GAAiGGTAATT GTTTCCCCGA ATGGOIVIOGA CTCAITTCTT GGGCCAGAGG AACAGTGGGS 4,00 

AAAATATCGG CTC3TTCCAT6 OCCTOCTTAT ATTTATGACr TCAACCATAA AGGAOTTGCT 540 

TTOOGACACT OTAROCSCCAA TGC3AACATGG GAirTTTATGC ACAGCTTAAA TAAAACATGG 600 

GCCAATTATT ObaACTGCCT TOGCTTTCTG CAGCCA3ATA TCAQCATA<?q AAAQCAAOAA €£0 

7TCTITGAAC GOCTCTATST AATGTATACC GTTGGCXACT CX3VICICTTT TGGTTCCTT6 720 

GClXSlGaCZA TTCTCATCAT TGOTTACTTC AQAOSATTOC ATTQCACIIIQ GAACZTA^lC '7dQ 

O^CAIGCACT TATTTGTGTC TTTCATGCTG AGAGCTACAA GCATCTTTGT CAAAGACAGA 840 

GIAGTCCSiTG CrdkCKSSkOG AGTAAAiSGAG CTG<3?«3TCCC TAAtAATOCA GGATGAOC3CA 900 

CAAAATTCCA T7GAGGCAAC TTCTGTGGAC AAATCACAAT ATATCGGGTG CAAGATTGCT 960 

GIXGXGATOT TTAITTACTT llSClGdCrACA AAOTCAl^TArtT GGATCCTGGT GGAAGGTCXC 1020 

33kCCTGCAIXft ATCTCATCTT TGTGGCSn^TC TTTTOBCnCA CCAAAXACCT QXGGGGCTTC 1D80 

ATCTTGATRO QCIGQGQQTT TCCnSCASCA TITGTTGC3U3 CATGGGCTQT QGCACGIVSCA 1140 

ACTCTOGCTG ATGOGAGGTG CTGGGAACTT AGTGCTGGAG ACATCAAGTG GATTTATCAA 1200 

GCACCSGATCT TAGOUSCTAT tl^GQGCTGAAlT TXTATTCCG'r TTCTGAATAC GGTl^AGAGl^ 1260 

CTAGCTACCA AAATCTOGGA GACCAATGCA QTTGGGCATa ACACAAGGAA GCAATACAGQ 1320 

AAACIGGOCA AATCGACACT GGTGCTGGTC CTAGT CTiraS GMSTGCATTA CATOSIGT T C 1380 

GTATGGCTGC CTCACTCCTT CACTGGGCTC GGSTGOGAGA TOCGCATGCA CTGTGAGCTC 1440 

TTCTTCAACT OCTTTCAG6G TTTCTTTGTG TCFATCATCT ACTGCTACTG CAATGGAGAO 1500 

GTTCAGGCAQ AGGTGAAOAA GATGIGQAGT CGGIGGAATC TCTCXXSIGGA GIGGAAAAGG 1560 

ACACCQCCAr GTGGCAGOOG CAGATGO^SC TCASIGCTCA CCAOOGTGAC GCACAGCACC 1620 

AGCAGCCAGr OlCAGOrGGC GGCCAGCACA CSSCATOGTGC TXAmXTTGC CAAAGCTGOC 1680 

AAGATCOCCA GC3W3ACAGCC TGACAGGCAC ATCACTTT3UC CIGGCTATOT CIGQA9TAAC 174 Q 

TCAGACCAGS ACtGQCfTGGC ACACTCTTTC CACGAGGAGA OCAAGGAAGA TAGTGGGA66 IBOD 

CAGGGAGATG ATAT7CTAAT GGAGAAGGCT TOCAGGCCTA TGGAATCIAA CCGAGACACT 1D60 
GAAGGATGCC AAGQAGAAAC TGAGGATOTT CTCTGA 

Seq ID NO: 524 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MCASSLST8I VIi^F8SF9T IHE8IGSRKR HRFLBQLDSD GTITIBBQIV LVLKA3CVQCE 60 
IKITAgLQBO EGHCPFBMDG LICHPRGTVG KIBAVPCPPY lYDFNHKOVA FRHCKPNGTW 12 O 
DHMU3UIKTW AKYSDCIJiPL QPDISIGKQE FFERLYVMYT VGYSISFGSL AVAILIIGYF 180 
fiRUHCSSmi EMHL7V&FME. &ATSXFVKDR WHAEIGVKE LE8LIMQPDP QUfilEATSVD 240 
ICSQITJGCKIA WMFIYFIAT NYYWXIiTBGL YXJUOiIFVAF F5DTKXIMGF IXiIGHBFPAA 300 
PVAAKA-VABA TlADKRCHBL SAGDIKRIYQ APILAAIGLSI PILPLHTVIiy XATBtlMEXHA 360 
VGQDTRKQYIL KUUCSTZaVZiV IiVFGVHYIVF VOiFHSFTQIi SHBISHHCBI* PFNSFQGFEV 420 
SIiyCYCBGB VQAEVKRMRS RnmjSVDWXR TPPGGSRROG SVIA1*VTHST SfiQeOVAAST 480 
SMVZjISGKAA KIASRQPSSH ITIiFGYVUfiK SBQDCCFHSSr HBBIKEDSGEl QGDDILMEKP 540
SRFtlBSHPDT EGCQGETBDV L 

Seq ID NO: 525 DNA sequence
Nucleic Acid Accession #: NM_005048
Coding sequence: 143..1795

1 11 21 31 41 51

| | | | | |

QGCCOGTGGC CCGGGCCCGA CCACCCXZAGC TGOSCSTGGT TACTOGCCAC AABTTlGCrC 60 

TGOGCCAOCC AAOTTGQCaA CXTGGAAGCT TCrOCCX?8QC: TCrGGAOOMS GGTCOCXGCT 120 

TCTTCCTACA GCC6TTOOC5G GCATGGCCasa <3CK3QGGG06 TOQCTCCACa TCTQOGGTTG 180 

GCTAATGCTC OacaGCTGCC TCCTGGCfCAG AGOGCAGCTG QATTCTGAtG GCACCATTAC 240 

TATAGASGAG CAGATTGTCC TTGTOCTC?AA Af^dSAAAGXA CAATGTGAAC TCA&CATCAC BOO 

AGC£CAACTC CAaaaBQQAG AAGGTAATTG TTTGCCTQAA TGQOATGGAC TCATTTCTTG 3^0 

GCCCA(3AGGA ACA6TGGQGA AAATATCGOC TOTTCCATeC CCTCCTTAXA TTTATQACTT 420 

Caul£3C3mkAA 06AQTTGCTT TCCGACACTS TAACOCCUVT QGAACSUFGOG ATITTAIGCA 4B0 

CRBCTTAAAT AARACATGGG CCAATTATTC AGACTGCCTT CGCTTTCTGC AGOCRGATAT 540 

CAGCATAGGA AASCAASBAT TCTTTGAACG CCTCTATOTA ATOTATAGOG TTGGCTACTC fiOO 

CATCTCTTTT GGTTCGTTGG CTGTGGCTAT TCTCA.TCATT G6TTACITCA GACSATTQGA £60 

TTTdCACTAOa AACTATATCC ACATGCACTT ATTTGIGTCT TTCKTOCXOA GAGCIACAAG 720 

CATCTTTGTC AAAGACAGAG TAOTCCATGC TCSUCAT2USGA GTAAAG6AGC TCSCaAOTCGCT 780 

AATRATGCAG GATGACOCAC AAAAITOCAT TGAGGCAACT TCTQTGOACA AATCJ^CftATA 84 D 

TATCGGGTGC AAOATreCTG TTOTGATOTT TATTTACTTC CTGGCTACAA ATTATTATTG 900 

eATCCXGGXG GAaSGTCTCT ACCTGCATAA TCraVTCTTT QT<3GCTTXCT TTTOGGACAC 5fi0 

CAAATACXrrO TQQOaCETCA TCTTGATAGG CXGGGQGTTT CCASCAOCAT TTSTTGCAGC 1020 

ATGGGCT6TG GCftOGAGCAA CTCTCGCTGA TOtSQAGOTOC TGGGMdCTTIi GTGCTOQASA 1060 

CATCAASTGG ATTTATCAAS CACCGATCIT AGCAGCTATT UtJOCl'URArr TTATTCTGTT 114 D 

TCTGAATAGG GTTAGACSTTC TAGCTACCAA AATCTGG6A8 ACCAATGCAQ TTQOCiCATGA 1200 

CACAAlS(3AA£t CAATAlCAGGA AACTGGCCAA AT08ACACIG OTCCTGGO^CC TA6TCTTTGG 1260 

AfiTGCKTTAC ATGGTGTTO? TAIXSOCTQCC TCACICCTTC ACTGGGCTCQ GOTGOanOA'C 1320 

CCQCATtSCAC TEGTGAGCTCr TCTTCAACTC CTTTCftGG3T TTCTTTeTGT CTATCATCTA 13 BO 

CTGCTACTGC AATG6AGAGG TTCAOaCAQA GOTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CrcaaTQOAC T<SGAAAAG(3A CAOCGCXMG TGGCAGCOGC Ai3ATQGQGCT CAGlJGCTCaW: 1500 

CACOGT6ACG CACAGCACCA GCAGCCAQTC ACAGGTGGCG GCC2U3CACAC GCATGOTOCT 1560 

TATCTCTGGO AAAGClGCCA AGA'1t:GCCAG CAGACRGCCT GAC3U3CCACA TCACIITAOC X620 

rGGCTAIGTC TGGAGTAACT CAGAGCAGQA CTQC3CIGCCA GACTCTTTCC ACXaAGQAGAC 1680 

CAAC3GAAGAT AGfl&GGfdSGC AGGGASA!FGA TATTCTAA1Y3 GAGAAQCCTT CCA5GCCTAT 1740 

GGAATCTAAC OCAGACACTG AAGGATGCCA iUS^AGAAACT GAGGATGTTC TCI6AATGGA IdOO 

C3VTTT(3TGGC TGACTTTCAT GGGCXGGXCC AATGGCTGGT TGTQTGAGAG GGCXTOGCIG 1660 

ATftCXCCTAT GCTTGAGTTC AAAGGCTQAA AAaTCAGTTA AGGTGTDACT TAAKAATAGT 1920 

TTTTAGQCTC CATGAATTGO CTCCTGTTUUL TACTAAOGAC ATQAAAATGC AACaGTCAAT 1980 

GGAGTAGTXT ATTACCTTCT ATTGGCATCA AQTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCT<nGTGAT TGTTCATTTT TTSCTGCTTMl TTTTG66TAG AAAAAAQAXT CAAlTGCTTG 2100 

GCTGTAGCTT TCTCTCATAT ATATCACCd AAATATAATQ AAGATCITXT AGTGTGXATC 2160 

ATTTTCCTTT TAGAAACTAO TATTenCXTA TITCrXaCTT TAATCTRCST CTATCRCIGC 2230 

ATTTATTTTG CCTGTGCATA GGAGCAATTA GGATCXAAAA AAATATAIOG GAA6ATAAAA 2280 

GATCTAAGAA CftASTACTTG CTOGAAAAlT AGTTGGCTQG ACATO3ATAA AATAATGCAT 2340 

TTATAACAAT TACATGTGXT TTTGGGAACA AGGAAAAITT CTCAAAAAAG AATATTTCSU: 24D0 

ACATCCCTTC TETTGAAIQG CCTCTTTGXG AGGAGCCAGA OCTCAGBTCT TCACICTITC 2460 

TTCTTXGTAA ftCCATGTCAT GTGGAAAGKT TTCCTCaUSTZ AGTGAGCTT6 TGTCTQCAAA 2520 

TTQAXTTXGS nX3^AAXGl2A lI'l'TGKCAGC AAATCATOCT GCATCXATA.T CmTTCTTG 25B0 

T^nVGAGCTGT TACXAaVTTO TACATGGCAT OTGGQASCftA TIAAAAATTT GTTTTAAAAA 2640 
T 

Seq ID NO: 526 Protein sequence
Protein Accession #: NP_005039



I 



1 
1 

HABU3ASUIV 
GNCFPEHDGIi 
KYSDGLRFLQ 
HHIiFVSFMLR 
VHFIYFLATH 
XJOIABCHBLS 
ZiAKSTLVLVL 
QAHVKRMH5R 
lAfiRQPDSHX 
GGQGETEDVI* 



Seq ID NO: 527 DNA sequence
Nucleic Acid Accession #: XM_03«683
Coding sequence: 38..3655



zcnPRijrvGK 

P0Z8IGKQSF 
ATSIFVKDSV 
YYVIIIiVEGIiY 
AGDIKHZVQA 
VPGVHSfXVFV 
msOtiGVtlHKKT 



21 
1 

LASAQiLDSDG 
ISAVPCFFYX 

FERLYVHSrrV 
VHAHIGVKSL 

PrLAAlGLSIF 

E«CX3SRK068 VIiTTVTHSTB 



31 
I 

TITXBEQIVL 
VDPNBKSVAF 
GTSISFGSIA 
ESLIMQDDPQ 
SDTKYIiKQFZ 
IXrFLSITVRVXi 



41 
I 

VliKAKgQCBI* 
JtHCNSEIGmD 
VAIIiXIGYFa 
HSIEATSVDK 
IilGHGFPAAF 
ATKXVSmAV 
FHSFQ6F9V& 
SQSQVAASTa 
GDDIXMBKSS 



51 
1 

KIIAQLQBGB 
7H!IS£NmHA 
RLECTSKYIH 
SQYIGCKIAV 
VAATO^VABAT 
CTDTKSQVSK 
IIYCYCHGBV 
KVIrlSGKAAK 



60 
120 
IBO 
240 
300 
360 
420 
480 
540 



1 
I 

□CTTTGCCCA 
GATAGCAGCC 
AATOOTOCAA 
CAGCACAAAT 
GTCASCQAC3G 
AAA2U3ATTTT 



11 21 31 41 51 

11111 
QTAGTTGGAA AGTGAACTOG ACTOSTGATG GTTCTOCTGT CACTTTGGTT 
GCTCTOOTAG AGGrTAGGAC TTCAGCrTGAT GGACAAOCXG GXAATGAAGA 
ATAGATTTAC CAATAAAOAG AXAXAGAGAG TATGAGCTGG TGACTCCAGT 
CTAGAAGQAC GCIAXCICTC CX»TAC1TCTT TCZGGGAGTC ACAAAAIUBAG 
GAOaXGTCTT GCAACCCIGA QCIUSITGTTC TTTAMVTCA GOaCATTTOG 
CATCEGCGAC TAAftGCCCAA CRCTCA3^TA GTAGCTGClG GGGCTGTTST 
GAGACATCTC TOCSTGOdGS GZUUTATAACC GATCCCATTA ACAACGATCA 



60 
120 
180 
240 
300 
360 
420 
ACCAGGZ^AQT GCTACGTATA 6AATCOGGAA AACAQAGCCT TTeCRGACTA ACTGTaCTTA 480
TGTTGtJTGAC ATGGTGGACA TTCCRQGAAC CTCIGTTGCC ATCAGCAACT GTGATGGTCT 540 
GGCTGGAATa ATAAAA3«?rG ATAATGAAQA GTATTTCATT GAACGCTTGG AAAGAGGTAA 600 
AGAGATGGAG GAAGAAAAAG GAROOATTCA TGrrTGTCTAC AA<3A(3ATC3«3 CTGTAGAACA 660 
GGCTCCCATA GAC2ATQTCCA AAGACTTCCA CTACAQAGA.6 TCGGAGCTGG AAGGCCTTGA 720
TGATCTAG6T ACTGTTTAT6 GCAACATCCA OCAGCAGCTG AATGAAACAA ^XSAGACGGCXS 180 
CAGACACGCO GOAGAAAACG ATTACAATAT OGA09TACTG CTGGGftSTGG ATGACTCT6T «40 
QGTCCGTTTC CATGGGAAAG AGCACGTCCA AAACTACCTC CTGAOOCTAA TGAACATTCT 901) 
GAATGAAATT TftCCATGATQ J«5TCCCXOGG AGTGCATATA AATC3TGQTCC TGOTQCQCRT 960 
GATAAX6CTG GGAIATGCAA AGTCCATCA6 CCTCATAGAA AGGGGAAACG CATCCAGAAG 1020
CTTGQA6AAT C3TGTGTCQCT GGGGOTCCCA ACAGCIiftASA TCTGATCTGlk ACCACTCIGA 1080 
ACACCATGAC CATGCAATTT TTTTAACCAG GCAAGACTTT OGACCIGCT6 GAATGC7VAGG 1140 
ATATGCTCCA GTCACCGGCA TGTGTCSVrCC AOTGfiGAAGT TGTACCCTGA ATCATGAQGA 1200 
TGGTTmCR TCTGCITTXG TAtSTAGCCCA TGAAAOGQGC CATOTSTIGG 6AATGGAGCA 1260 
TGATGGACAA GGCAACAQGT QTGGTQATGA OACtGCTATG GGAAGTGTCA TGGCTCCCTT 1320
GGTAGZUU3C3L GChXTOCA^TC GITAOCACIG GTCOOQATOC AGXGGTCftAG AACTGAAAAG 1380 
ATATATCCftT TCCTATOACT OTCTCCTTOA TGACC3CTTTT GATCATGATT GQCSCIAAACT 1440 
CCCSUSAACIT CCTGGAATCA ATTATTCTAT GGATQAGCAA TSTCSSXTTTG ATTTTGGTGT 1500 
TGGCTATAAA ATGTGCaCC33 0GTTCCC5AAC CmXSAOOCA TGTAWLCftGC TGTOSTGTAS 1S€0 
CCATCCIOAT AATCSCCTACr TXOXSXAAGAC TAAAAAGGGA CCTCCACTTQ ATOC3GACTGA 1620
ATGTGCTGCT 6GAAAATGGT GCTATAAQSB TCaVITGCATG T6GAA8AATG CTAATC3U3CA 16B0 
AAAACMU3AT OC3CAA1TGGG GGTCATGGAC TAAATTTGQC TCCTOTTCTC GGACATGTGG 1740 
AACTGGTGTT OC?TTTCAGAA CACaCCAGTG CAA^TAATCCC ATGCCCATC&> ATrGGTGGTCA 1800 
GGATTSTCCr GGXGITTAATT TTGAGTACCA GCTTTGTAAC ACAGAAGAAT GCCAAAAACA 1860 
CTTTGAGGAC TTCAiaftl3Cft.C AGCAGTGTCA GCAGGGAAftC TCCCACITTG AAT2\JCC2USAA 1920
TAOCSMCAC CACTGGTTGC GATATGAACA TCCTGACCCC AAGAAAAGAT GCCACCTTTA 19 BO 
CTGTCAGVCC AAGGAGACTG GAGATGTTGC TTACATGAAA CAACTGGTGC ATGATGGAAC 204Q 
GCAGTGTTCT TACftAAGATC CATATAOCAT ATG^TGTGGGA GGAGASTGTG TGAAAOTGGG 2100 
CTOTOATAAA GaARTTGGTT CTAATAAGGT TGAOSATAAO TGTCGTGTCT GTGGAGGAGA 2160 
TAATTOCCAC TOCGQAACCSQ TGAAOaGGAC A<rrTACCAGA ACTCXXAKSQA AGCTTGGGTA 2220
OCTTAAGAXG TTTGATATAC GOGCTGGGGC TAGACKIGTG TTSMStCCMaS A3USAGGAGGC 2200 
TTCTOCTCAT ATTCTTGCTA TTAABAACCA GGCTACAGGC CATTATATTT TA2ATGGCAA 2340 
AGOaOAOQAA GCCAASTGGC GGACCTTCAT AGATCTTGGT GT6GAGTGGG ATTATAACAT 2400 
TGAftG/LTGAC ATTGAAftOTC TTCACACCGA TGGACCTTTA CATGATCCXG TTATTGTTTT 24 60 
GATEATACCT CAAGAAAATG ATACCCGCTC TAGCCIGACA TATAAGTACA TCATCCATGA 2520
AGACTCTGTA CCIACAAtCA ACABCAACAA TGTChTCCAG GAA6AATTAG ATACITTTGA 35B0 
GTOGGC1TT6 AAGASCTGGI CTCAGTGTTC CAAACSaCTOT GGTGGAGGTT TCCaU^TAGAC 2640 
TAJUmOGGA TGOCSTAGGA AAAGTGATAA TAAAATCGTC CATCGCAGCT TCroTQAOGC 2700 
CMCMAAAG CaSAAACCTA TTAGACGAAT GTCCAATATT CAWSftGTGTA CACATOCACT 2760 
CTGGGTAGCA GAAGAATGGG AACACTGCAC CAAAACCTGT GGAAGTTCTG GCTATCAGCr 2820
TGBCACIQIA OGCTGGCrCC AGOCACTCCT TGATOGCACC AACOQGTCTG TGCACftGCAA 2660 
ATACTGCATG GGTGACOGTC CCSQAGAGCCO CGG6G0CXGT AACAGAGTGC OCTGCCCTQC 2940 
AJCAGTGGIVAA ACAGGACXJCar GGAGTGAGTG TTCaGTOACX: TQCGGTeAAG GAACXSGASGT 3000 
GA6GCAGGTC CTCTGCAGGG CTOGQQAiCCA CTGOGATGGr GAAAAGCXTG AOTO3QTC!Aa 3060 
AGCXZIGTCAA CTGCiCrOCTT GTAATGATGA ACCKTGTTrG GGAGACaMGr CCASrATTCTG 3120
TCftAATGGAA eTGTTGQCAC GATACTGCTC CATAOCSUSGrF TATATkCAAGT TATGTTGriGft 31B0 
OTCPTGC3U3C AABGGC^ABTA OGACCCTGCC AGCAGCATAC CTTCTM12UU3 CXGCCGAAAC 3240 
TCATGMGAT GTCATCTCTA ACCCTAOTGA COTCCCTflGA TCTCTAGTGA TSCCTACATC 3300 
TTTaOTTCCT TATCRTTCftfi AQACOCCTGC ARAGAftGATG TCTTOaftGTA GCATCTCTTC 3360 
AGTG6GAGGT GCAAATGCAT ATGCTGCTTT C3USGCGAAAC AGTAAACCTG ATOGTGCIAA 3420
TTTACQCCaS AGGAGTQCTC AGCAAGCAGG AAGZAAGACT OTGAGAGTGe TGACOGTACC 3480 
ATCCTCCCCA CCCACCAAGA GGGTCCACXTT CAGIITCASCT TCACAAATG3 CIGCTGCTTC 3540 
CTTCITTGCA GCCAGTGATT CAATAOGTGC TTCTTCrcM QCftASAAOCT CAAAGAARGA 3600 
TQ3AAAGATC ATTGACAACA GACOTCX^GAC T^AGATCATCC ACXriTAGAAA GATQAGAAAG 3660 
O^GAACCAAAA AGGCTAGAAA OCAGAGGAAA ACCTQGACAA OCTCICTCTT COCA7GGTOC 3720
ATA1QCITOT TTAAAGXSGA AATCTCTATA GATOGTCAOC TCSVITTIATC TOXMTSGGA 3780 
AGh2VCAG9U\A GTGCIGGCTC ACTTTCTAGT 'lUCl'lTCATC CL'CCriTTST TCTGCATTGA 3B4D 
CTCATTTACC AGAATTCATT GGA2^GAAATC ACCAAAGATT ATTACAAAAQ AAAAATATGT 3900 
TGCIAAGATT GTGTTGGTCG CTCTCrnOAAa CAOAAAflOeG ACTGGAAiOCa ATTGTGCATA 3960
TCAlSCTGACZ TTITXSTtrCGX TTTAGAAAAG TTACAGTAAA AATTAAAAA6 AGATAO CAAT 4020
GGTCTftCACT ^AAdUUgiA ATTTTGGATA TGGAACAAAG AATTCTTAGA CTTGTATTCSC 4080 
TATirA-XCTA TftXTAGAAAT AT3GTATGAS CAAATTIGCA GC Tgil 'GirGT AAATACTGTA 414 D 
TATTGCAAAA ATCAOTATTA TTTTAAGAGA TGTGTTCTCA AATGATTGTT TACTATATTA 42 OO 
CATITCTOGA TGTTCTAGGT GGXrTGXOGTT GAGTATTGCC TTGTTTQACA TTCTATAG3T 4260
TAAXrrrCAA AGCAGAGTAX TACAAAAGAG AAGTTAGAAT TACACSCTACr GACAATATAA 4320
AGGGTrXTGT TOAATCAACA ATGT6ATAC3G TAAATTATEAG AAAAAGAAAA GAAACACAAA 4380 
AGCZATAGAT ATACSIGATAT CtVSCrXhCCT ATTGGCTTCT ATACTTATAA TTXAAASGAT 4440 
TGGT6TCTTA GTACACTTGT OGTCACABGG ATCA2U3GMT AGITAAATAAT GAACTCOTGC 4500 
AAGACAAAAC TGAAACCCTC TTTCCftGGAC CTCRGTAGGC ACC3OTTGAC30 TGTCCTTTGT 4S60 
TTTTQTGTGT GTGlXSTTtn^ TTTTAATrTT CSGCATTGTTG ACAGAXACAA ACAGTTATAC 4620
TCAATGTACT GTAnTAATGG CAAAGGAAAA AGITTTGGGA TAACTTATTT OTATGTTGGT 4680 
AGdGASAAA AATATGA9X3V GTCTAGAATT GATATTTGAO TAXAGSAGAG CXTTGGGGCT 4740 
TTGAAGGCAG GTTCAAGAAA GCATATCTTOa A3GGTTGAGA TATTTATTTT OCATATOGTT 4B00 
CATQTTCAAA TGTTCACAAC CACAATGCAT CTGACTQCAA IAATGTGCTA ATAATTTATG 4860
TCAGXAGTCA CCTTGCTCAC AOCAAAQCCA GAAATGCTCT CTCCAGGGAG TAGATGTAAA 4920
QTACrTGTAC ATAQhATTC3^ GAACIGAAGA TATTTATTAA AAGTIGATTT TITTTTCTTG 4980 
AiaGTATTTT TATGTACTAA ATATTTAC3VC TAATATCAAT TACATATTTT GGTAAACTAG 5040 
AGAGACATAA TIAGAGATGC ATGCTTTGTT CTtTTGCATAG AGACX::m!AA GCAAACTACT 5100 
ACAGCCAACT CRAAAGCTAA AACTGAACAA ATTTGATGTT ATGCAAACAT CTTGCATTTX S160 
lAGTAGTTGA TATTAAGTTG ATGACTTGTT TCCCZTCAAG GAAACATIAA ATTGTATGOA 5220
CrCAGCTAOC TfiTTCAAXGA AAarCGTGAAT TOSAAAC3WTT mAAAAGTT TTTGAAAGAG 5280 
ATAABTGCAT CATGAATTAC ATGTACATGA QAOGAGATAG TGATATCAOC ATAATGATTT 5340 
7GAGGTCAGT ACCXGAGCIG TCXAAAAATA TATTATACAA ACTAAAAXGT AQATGAATTA 5400 
ACCTCrCAAA GCACAGAATG TGCAAOAACT TTIGCATTTT AATCOTTGTA AACXAACA6C 5460 
TTAAACTATT

CCTAAAAATT 
CCaCAGGGGA 
ATACTTCTGA. 
TACACTACTG 
AT 



GACTCTATAC CTCTAAiySAA TTGCTI3CTAC TTTGTGCRAXS AACTTTGAAG 
aCAAATTOCa GATAGTAAAA CAATCCCTAA GCCTTAftSTC TTTTTTTTTT 
CCCATAOAAT AAAATTCTCT CTAGTTTACT TGTGTGTaCA TACATCTCAT 
AGATAAAGAT GGTCACACAa. ACAGTTTCCA TAAAGA-TGTA CATATTCATT 
CCTTTGOaCT TTCTTTTCTA CTAAGCTAAA AATTOCTTTT TATCAAAGTQ 
ATGCT6TTTG ^fTGTACTGAB ACiCACGTACC AATAAAAATO TTAACAAAAT 



Seq ID NO: 528 Protein sequence
Protein Accession #: XP_035683



1 
I 

HVIiIi5LHI*IA 
L3ASBKKRSA 



11 

\ 

AAI»V£VRTSA 
S0VSSNPBQL 
SATYRISICIEE 
XEPLBRGKOK EEBXIffiLllIW 
AQBHDyHXEV 
JjGYAICSISIjI 
pvtohchfvr 
AATHRYHSrSR 
KMCTAFHTFD 

DFAAQQCQQR 
SYKDPygXCV 
MFDIPPGARH 



IHWliVRHIM 

FaPAOMQayA 

HGSVHftPItVQ 
QCaFDFGVOY 

MTBEC3QKHFB 
VQL-VHDOTHC 
SXSRKLGYLK 
GVBHDUIZED 
QEEUXFFEnA 
lOECTHPLWV 
CNKVPCPAQW 

RSLVMPTSIAT 
TVSIiVTVPaS 
STLER 



LKSMSQCSKP 
AEfiHGSCTK:!? 
KTGPWSECSV 
SVLARYCSZP 
FYESETPAKK 
PPTKRVHL&& 



21 

1 

DCSOAGWEEMV 
PENITAPQKD 
BLQISCAYVG 
VKKSAVEQAP 
IiLGVDDSWR 

CSQQB[*KRYI 
PCKOLWCSHP 

NSHPEYQNTK 
S0BCVKVOCD 
VLIQEDEASP 
UBDFVXVLXX 

CG^SGYOLKT 
TCQECfTEVRQ 
GYMKLOCSSC 
KSIiSSTSSVG 
AGQKAAASFF 



31 



41 



QIDLPIKRyR BYELVTPVgr 
PHUUiKSHTQ LVAPGAWSW 
DrVDIPGTSV AI8NCDGIA6 



51 
I 

NiEGRYIiSHT 
HBISLVPGHI 



FHGKEHVQNY 
WCRWASQQQ 
BBAFWAHST 
HSYDdiLDDP 
raiPYFCKTKK 
VSPRTHQCHN 
UHHLPYBHPP 
KE1GS£3KVED 
KXIAIKHQAT 



LIiTLMNXVHE 
RSDLNHSEHH 
6KVI£MEHDG 



VRCLQPLLDG 
VLCRAODHCD 
SKRS8TLPPF 
QPHAXAAFRP 
AASDSIGASS 



GFPLDGTBCA 
PMPINOGODC 
PKJKRCHLYiCQ 
KGGVCGGEffiTS 
GHYIU«3KD& 
TYKYIIHEDS 
VHR£FCEANK 
TNRSVHSKYC 
GBKPESVRAC 



GTVYCajIHQQ 
lYHDRSLQVH 
DHAIFIfTRQD 
QQNROQDSTA 
XiPGIMYfiMDS 
AtHORCYKOKC 
PGVKPEyQLC 
SKBTOUVAYH 
HCRTVKGTPT 

VPTTNSMMVI 
KPKPIRBHCN 
MGDRPE3RRP 
QLPPCNDEPC 

Qfi&AQQhSSK 



Seq ID NO: 529 DNA sequence
Nucleic Acid Accession #: NM_002774
Coding sequence: 346..9B0



1 

i 

AOGOGGACAA 
AGCCCQOOGC 
GCTQGCTOGC 
GGCAGCTKjrr 
OGGCCATGSU^ 
AOAAIAAOTT 
TCTACACCTC 
CAGCTQCCCA 
AAAGGGAGAJ5 
ATQCCXSCCAS 
CTGAACrCAT 
ACATCCTGGG 
ACATCCAOCT 
ACATGTTGTG 

GATCSU^AdSGA 
AAAOOVTTGA 
TGGCTGSTTC 
TOACCCIGAT 
CAGCTGCATC 
TACCCCC3«:C 
TTCCACACX5T 
ClGCTTACTCr 
AAflATBAAQA 
TTTCAChCCT 
TATTTT 



11 
I 

AGOCOGATT6 
A06GGCOGGG 



GAAGCIGAT6 
93TQCATQQC 
OGGOCACTT6 
CTQCSUUUIAA 
TTCCCAGGAG 
OCATGACX:!Aa 

ccAQocccrr 

GTGGG6CAAG 

TGCTGGGGAT 

GAAGOCA6GA 
GGCCAAOTGA 
CAGAACGTCT 
GCTTAATAAA 
CTTTGCATCAC 
ACXAAGAGAA 
TTBATTTCTT 
TGGGTGTC3CC 
TAAOOATGAT 
ATGACATACA 



21 
I 

TTOCTGGGCC: 
GCCAGrGTGG 
GGACACAGAO 
ACTCAAGAAT 
GTGGIGCTGA 
QGACCCIGOG 

CCX3AATCTTC 
CAGAGTTCra 
GACATCA.T6C 
CCGCrGQAQA 
ACAGGAGATG 
GAGOAQTOTG 
GAGAAGTAOG 
CACCTCCtSaO 
GTCTACACICA 
CCCTGACATG 
CTCAOCTASA 
OSCAGGGACG 
TGOQGASGAC 
TACAOGAAAA 
CCTOGAOAOG 
CTTGGGATGT 
ACAQTCTCCA 
TGGGATAGCA 



31 
I 

CTITTCCCCAT 
TGACACAOGC 
GTCX^QEZAGGC 
CCCCGGAGGC 
GTCTGATTGC 
ACftAGACATC 
GGGTCCTTAT 
AGGTCTTCCT 
TTGTCCQGKaC 
TGlTGGGCCr 
GOaAiCTGCTC 
6TGATTTOCC 
AGCATGCCTA 
GGAAGQftTTC 
GCCrTGTGTC 
ACGTCTGC7VG 
TGACAHrCTAG 
OCTT0OCTCC 
TGAGGGTCCT 
GTOATGAOTQ 
TCCCTTCTAG 
CCCAGCCACG 
AOCTTTCTTC 
7CAGGCAGTG 
CCTGGGCOGC 



41 
I 

OGGGOCIGGG 
TGTAGCTGTC 
ASCACACSVQA 
COGGAGGCCT 



TCAOOCCIAC 

GGGGAAGCAT 
TOIGATCCAC 
G6C3U:G0CCA 



51 

1^ 

tccccggctq 
GGGACCTAGG 
GCAGCAGGAG 
aCAGAGOAGC 
CAAGCXGCCC 
TGOGTOCrCA 
AACCTTOGGC 
CCXGACTATG 



TGACACCATC 
OCCIGGGCAG 
CTGCCAGGOT 
ATGGGGTAAC 
ATACACQAAC 
CICCC2SACCT 

GATTCTCCCT 
AGGACTTOQG 
6CATCTOCTC 
TCSrCTQGAAT 
ACTGCAGATT 
GCIGTTGGAA 
CATQCACICA 



S520 
5580 
5640 
S700 
57G0 
5620 



AGCnSCIGCC 
CAGTOTOCAT 
ATCAOOCAGA 
OATTCT9GQG 
ATCCCCTGT6 
TGGATCCAAA 
ACCACOCCAC 
TGCCCRGCTC 
GGTTTTATOC 
atXITCGGTCr 
TCCCCRACCC 
CGCAGCTCOG 
TCTCSiOCTGX 
AGATTTAAGA 
ATAAAC3AATG 



Seq ID NO: 530 Protein sequence
Protein Accession #: NP_002765

11 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200



60 

120 
180 
240 
300
360 
420 
480 
540 
600 
660
720 
780 
840 
900
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440
1500



1 11 21 31 41 51 

| | | | | |

MKKLMWItSI. lAIkAHABEQN KLVEGGPCDK TSHPYQAALY TSGHLLCGQV LIEPI^TVLTA 
AHCKECPNLQV FLGKHNLKQR ESSQBQSSW BAVIHPDYDA ASHDQDIHLL RLftRPAKLSE 
IjIQPLPL&RD CfiAHTTSCHI LGNGSCTADGSD PPDTIQCAri HLVSREBCBB AYPGQITQHH 
liCAaDHBCZGK VSCQGDSGGP liVCGDHIAGI* VSRGmPOQS KEKFGVYHIV CRYTHHIQKT 
XQAK 

Seq ID NO: 531 DNA sequence
Nucleic Acid Accession #: NM_012152
Coding sequence: 43..1104



60 
120
180

240 



11 



21 



31



51 
CTTCTTTAAA

GGAACAAAGC 
TCTAATTCTC 
TACCTGTT6G 
ATOTTTAACA 
GGGCTTCTGG 
AGGCACATGT 
CTGCTCATTT 
TGGTUikTTGCXI 

CTGCGGATCT 

GOGX!rTGTGC3 



TGGACTTTTT 
TTGTGATTaT 
TGGTCATCXSC 
CTAATTTAGC 
CAGGOOCftGT 
ACAOTAQCTT 
CAATCATGAG 
TGCTTGTCTQ 
TCTGC24ACAT 
TCTGGACAOT 
AOGT6TACX3T 
J3CC0OAGI3Aq 
TATGCTGGAC 



TTATAATAG6 
TTTGTOTOTT 
QGCAf^TGATC 
TGCTGOOGAT 
TTCAAAAACr 
QACTGCTTOC 
GATGCGG<3TC 

GGCcaroGCC 

CTCTGCCTOC 
OTCCRACCTC 
CAAGAGGAAA 
ACCCATGAAG 



GTGGTGAACC 
ATOATCTGCT 
GTCCTOiGCA 
6TCTGCAATA 
GTCTXAG6 



CCAICATCTA 
GCTTCTCTCA 
<3GAGTGACAC 
AAAGCA^TTC 



TGTGAAAAGC3 
CrOCTACAAG 

AGOCAGCCAO 
CTAAACTCTQ 



TTCTTCTCCA 

ACCaAOVCTG 
□GQAOQTTTT 
AAAAACAGAA 
TTCTTOQCTO 
TXGAC16TCA 
CTCAOCAACrr 
CATAGCSVACC 
ATTTTTATGS 
TCTTCOCTQO 
A'MGCCTTCC 
ACCAAOQTCT 
C7AATGAAGA 

IGGTTCCTGC 



GAS7LGGCGTC 
TACATAGAGO 
GATGCCTCTC 



I 

CAATGAATGA 
ATACTSTOSA 
TCIGCCTOTT 
AATTTCAITT 
QAATT6CCTA 
ACCGCT6GTT 
TGCTGGITAT 
T6ACC2U\AAA 
GGGCGGrCCC 
CCCCCATTTA 
TCATCATGOT 
TGTCTCCGCA 
COGTGATOAC 
TCXrxGGAOGG 
TGGTGQGQCr 
T6TATGGCAC 
CCrCTGQCAT 
ATAGTATTAS 



GT6TCACTAT 
TGACTGGACA 
TATTTTTTTT 
CCCCTTCTAC 
TQTAITCCTG 
TCTCCGTCAO 
CXSOCOXGGAG 
GAGGGTGACI^ 
CACACTGGGC 
CAGCAG6AGT 
TGTGGTGTAC 
TACAASTOG6 
TGrCTTAGGG 
CCTGAACTGC 
^TCRACrOC 
CATGM6AAG 
CCCCrCCACA 
CCAAQGTGCA 
QOTQATGACT 



Seq ID NO: 532 Protein sequence
Protein Accession #: NP_036284



1 11

| |
MEIECHYDKHM DPFYHSSHTD 
FfiFPSYYIiIiA I9IAAADFFAQ 
ZiVIAVERHKS XMBMRVHSHI» 
PTYSR9YEiVP HTVSKEMA^ 
VHTVZiQAFW OrTFGIiWIfL 
YGTMKKMICC PSQENQPERRP 



21 31 41 51

| | | |

TVroWTGTKL vm-CVOTFP CUFlFPSHSIj VIAAVIKigRK 
lAYVFUflEMT GEVSKTI.TVH RWPI*RQQr.IiD SSLTASIVTMIj 
TKKRVTIiI*II» liWAIATFMG AVPTUSWMCL CHISAC8SIA 
rnVWYLftXY VYVKIUCiamj spbtsgsisil bstbmkihrt 
LDGLNCROCG VQEVKRWPLIi LALOTSWSIF IIYSYKDBDM 
SRXPSTVIi&R SnrSGaQYlED SISQGAVCHK 5TS 



60 
120 
180 
240 
300 
360
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080
1140 



Seq ID NO: 533 DNA sequence
Nucleic Acid Accession #: NM_002821
Coding sequence: 150..3362

1 11 21 31 41 51

| | | | | |

AAcrcGcooc TcoGoncacc TcaaasTCSG gccgoggctg cggctgctqc 

GGGCTGCGCT GOGTGGGOCT OCIGIOOCOG COQO(3GaaCA OTCrGGGGCC GGOCGTGOGC 
CCTCAGCrcC TTTTCCTOAO CCOSCCGOGfL TGGGftGCTGC GOGGGGATOC CDOQOCaiGAC 
CCXX3CCQGTT GCCTCTGCTC AGOGTCCTGC TOCTGOCGCT GCTGGGCQGT ACCCAGACftG 
CCATTGTCTT CATOkAGCAG OOSTOC TOOC ABSaTSCagl GCAGOGGGGC GGGGCGCTGC 
TTCGCIGTGA GGntSASGC^ GC8QGC3CGQB TACKTBIGZA CTQaClGCIC GATOOOGCOC 
CTSTCCIUSGA CAOSGAiGCGG OGTTTCOCCC A0BGCAGCSU3 CCTGlAGCrTT GCAOCTOTaa 
ACaSGCfTGCA GC5ACTCTGGC ACCTTCCAGT GTGTGGCTOG QQATQATQTC ACTGGftGABG 
AAGCOCGCAG TGC3CAACQCC TCCTTCAAiCR TC31AATQGRT TCAGGCROCT CJCTQTQQTCC 
TGAAGCavrOC AG0CTG6GAA GCTGAGATGC AGCCftCAQAC CCAGGTCAiCA C TTOGTTG OC 
ACATTOKIGG GCAGCCTOOG OCCACCTIUC3C AftTGGITCOG ASAT6GGACC COOCTTTCTG 
AIG6TCAGA5 CAACCACACA GTCAGCAGCA AGQAGCXSQAA CCrrGMSQCTC G6GCCAGCT6 
OTCCTQAGCA TaGTGGGCM IMICCTGCT GOGCCCACRG TGCTTTTGGC CAXSQCTSXSCA 
GCAGCXZAGAA CTTGACCTTG AGCATTGCia ATGAAAGCTT TGOCAGSSTG GTGCIGQCAC 
COCAGGACGT GGTAGTAGCS AG8XATGAG6 AGGOCATGIT Cf^TTOCCAG TICTCA6C3CC 
AGOCaCCCCC GAGGCTGCAO TGGCTCITTO AGGAXGAGAC TGCCATCACT AAOOGCASTC 
GCC(3CCCACA CCTCGGCAlGA GOCACAGTGT TTGGCAftOGG GTCTCTBCTO CTGACCCAGG 
TCOGGCCACG CAATGCAGGG ATCXACOSCT GCAa:!rGGCCA GGGGCAGAG6 GGCOCACOCA 
TCRTCCXGGA AGCCACACTT CftJCCTAGCAG AGAITGftAOA CATGCOGCTA TTTGAGCCAC 
GGGTGT^tTAC AGCTGGCAGC GAGOaGOGTG TGACCtrGGCI TCCGOOCAAG GGTCTGGCAG 
AGCGCAGGG7 GTGGTGGGAG CAOSOGGGAG TOOGQCIGCC CACCCAXGGC AGGGTCZAOC 
ASAAOGOCCA QGAGCTGSTG rrGGCCAAXA TO»:TGAAAG TGATGCroaT GTCIACACCT 
GCCAOSOGGC CAACCTGGCT GG1X»GQGQA GACAGOfiTGT CAACATCACT GTGOCCACIG 
TQCCCIOCIG GCTGAAiSAAG CCCCAAGACA 6CGAGCIGGA GGAGGGC!AAA CCGGGCTACT 
TGGATTGOCT GACCXAGGCC ACACCAAAAC CTA£!ASCTGT CTG6TACAGA AACCAQATGC 
TCATCTCAGA. GGACTCAGGG TTGGAGGTCT TCAAGAATGG GACCSTGCGC ATCAACAGOG 
TGGAGGTGTA TGAIGGOACA TGOTACXSGTT GTATGAfiCAG CACCCCAGCX: GGGM3CKICB 
AGGOGCAAGC CGGTGTOCAA GTGCTGGAAA AGCTCAAOTT CA<::ACCACCA CCCCAGGCAC 
AGCAGIGCAT GGASTTTGAC AAGGAGGOCA OGGTGCCCTG rTCAGCCACA GGCOGfiGAGA 
AGOCCACTAT T21AGTGOGAA CGC3GCAGATG GGAGOUSOCT CCCAGAGTGG QTOACAGAOL 
AI3GCEGGGAC CCTOCATTIT GCCGSG6TGA CTOGnOAXGA CGCXGGCAAC TACACSIGCA 
TTGOCTCCAA CGGGCGGCAG GGCX3USATTC GIGCOCSVTGT OCAGCTCACX QfSGGCAGXTX 
TTATCACXrrr CAAAOXGGAA GCAGAGOGTA CGACTGTOTA OCAGGGCCAC ACAGCOCTAC 
TGCaOTGCQA GGCCOUSGGQ QACCCCaAGC OGCTGATTCA GTGGAAAOGC AAGGAOCGCA 
TOCIGGACCC CACCAAGCTG GGACGCA0GA TGCACATCTT CCAGAATGGC TCCCTGGTGA 
TCCATOAOGT GGCCCCXGAG GACTCSUMGC GCIACACCTG CATTGCAGQC AACSveCXGCA 
ACATCAAGCA CACGGAGGCC GOOCTCTATG TCSSTGOACAA GOCIGTGOOS GAGGAGTCGG 
AGGGCCCTOa CRGCCCTCCXI CCCJEavCAAGA TGATOCRGAC CATTGGGTTG TOBGTGGGTG 
COGCIGTGGC CTACRTCATT CCOGTGCTrGQ GCCrTCATGll: CTACTGCAAG AAGOGCTGCA 
AAGCCAAOGG 6CTGCAGAAG CHGCCOGAGG GOGAGGAGCC AGAGATGOAA TGCCTCAAGG 
GAGGGOCTTT GCAGAAGGGO CAGCOCICAG CAGAGATGCA AGAAGAAISTG GCCTTBACCA 
GCTTGGGCTC CX9GCCCCG0G GOCACCAACA AAGQCCACAG CAOUWaTOAT AAOATGCACT 
TCCCAOGGTC TAGGCIGCAG CCCATCACCA GGCTGGGGAA GAGTGAGTTT GGGGAGQXGT 
TCCTBGCAAA GGCTC3U3GGC TTGGAGGAGG OAGTGGCAGA GAOOCTCGrA CTSGTGAAGA 



60 
120 
180 
240 
300 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080
1140
1260
1320
1380
1440
1600
1920
2040
2100 
2160 
2220 
2280
2340 
2400 
2460 
2520
2580 
2640 
<3ECTI3CAGAC
GGAAQCTGAA 

actacatggt 

AGAGCAAGGA 
QCACCCAGGT 
TGGCTGaaCQ 
TCAGCAAGGA 

CCTTCGGIGT 

GCTGCCCITC 
GGCCCTOCTT 
GAGGRGGQRB 
CAGCATGATG 
TTGCtG2«5QT 
GGCTGACTTG 
CTCXTCCTCT 
TTCITOCCTT 
AGGCTTGGGA 
AGGGTTAATG 
ACACAQCAAQ 

ccsocncccTr 

CTTTTGACAC 
TC?C2VGCX3IGG 
GCCATGCITA 



GAAGGA.TC3AG 
CCACGCCAAC 
GCTGGAATAT 
TGAAAAATTO 
AGCCCTGGGC 
TAACTGCCra 
TGTGTACSiAC 
CCCOGAGGCC 
GCTGATOTGG 
AGTACTGGCA 
CSUACTCTAT 
CA<3TGft6ATT 
CCCQCICAGG 
GGCAAGA.TCC 
CTQftQCAQGtQ 
GAOCCJ^CT 
ATCAGOOACA 
GAOCGGSTCC 
TOAGCTGOOT 
AGTCTCTTGC 
TOAQTCCTCC 
CTCTOCTTTC 
TATATAAACC 
GGTOSGTGGG 
COOCftCACTT 
T^ACACTGS 



CAC3CAGCASC 
GTQGTGCGGC 
GTOGATCTGG 
AAOTCSiXZAaC 
ATGGA6CACC 
QTCAGTQCXE 
AGT6AGTACT 
ATCCTGGAGG 
GAAOTGTTTA 
GATTTGCAGG 
OC3QCTSATGC 
GCCAGCGCCC 
ATGGCCTGGG 
CTGIOCXGCT 
CCrW3C3CTTT 
GGGCGACTAG 
GTGTGGGTGC 
AACTCTGCCA 
TTaTGGQQAG 
CCRCTGGTCC 
CCACTCTGGG 
CTCATGCTAA 

CA^GIUSGT 
TTATTQTTQT 
CTGCTCICAA 



TGGACTTCCG 
TCCTGGG6CT 
GAGACCTCAA 
CCCXCAGCAC 
TGTCCayWM. 
AOAGACAAGr 
ACXa^CTTCCG 
GTGACTTCTC 
CAOW^SGAGA 
CIGOGAAGGC 



TOGGAGACAG 
CAGGGGAGGA 
GGGCCCTGAG 
CCTCCTCTTC 
GGCTMGASC 
CACAGGTAAC 
CTCATCTGCC 
TTOCTTAATA 
ACTTOSGGGT 
CTTGTGCACA 
GIGCCTOGCA 
TATQCACCAC 
AiSGGGTGGGC 
OOT'in.Ti'TTGT 
TAAATAftGCG 



GftGGGAGlTG 
GTQCCGGGAG 
GCA6TTCCTG 
CAAGCAOAAG 
CCXSCTTTGTG 
GAAGGTQTCT 
CCAGGCCTGG 
TACXAAGTCT 
GATGCCCCAT 
TAGACTTCCT 
GQCCCTCAQC 
CACJOSTQGAC 
CATCTCTAGA 
GIGCGCIAGT 
CTCACOCTCA 
TGC3GCAGTTT 
CCCAATTTCT 
AACTTTGCXIT 
TTCTCRAQTT 
CTAGACCAGG 
CTGAOCCAGA 
GA1GAAC9GAG 
GG€C@GCTTT 
CCTOGAGATG 

TTTTTTA 



GAGATGTTTG 
GCIGAGCCCC 
AGGATTTCCA 
GTGGCCCTAT 
CATAAGGACT 
GCCCTOQQCC 
GTGCCGCTGC 
GAT0TCTO3G 
GGT6GGCAGG 
CAGCOCGAOG 
CtX^iAGOACC 
AGCAAGCGGT 

gggaagctca 
gca2u:aggca 

TCCTTTGGGI^ 
CCCCTGOCAC 
GGCCTTCAAC 
GGGGAGGGCr 
CTQGGCACAC 
ATTATAGACSG 
CCCAOGTCTT 
TmCAGGftG 
TATATOTAAT 
AGGAGGGTG6 
TTTTTQTTTT 



Seq ID NO: 534 Protein sequence
Protein Accession #: NP_002812



1 
1 

NGAARGSPAR 

vmrmLLtiGA 

IKWIBAGPW 
KERtJl/TIiRPA 
EAHFHCQFSA 
CI(^G0R<3PP 
VRIiPTHGRVy 

C34SSTPAGSI 
G&5LPEHV1S 
TrVYCOTTAI* 
KfTCrAGNSC 
GLMFYCKKRC 
KBHSTSDKMH 
LDFKREIiEMF 
PliSlXOKVAI. 
'XHPRQAHVPL 
AG2CASLPQPB 



11 
] 

prrlpllsvi* 
pvqdterrfa 

ZJOIPABEAEI 
QFEHBOIiYSC 
QFPPSIiQNLF 
IXUBATI^HLA 
QKGHELVIiAN 
USCLrrQATPK 
EAQKKVQVXiB 
21AC3TIiHFARV 
LQCEAQGDPK 
HZia]TEAPI.Y 
KAKIOiQKQFE 
PPRSSLQPIT 
GKLI!IK3VHVVR 
CTQVALGHEH 
RHHSPEA1I.E 
GCPSKLYSUf 



21 
1 

LLPLLGGTQT 
QGSSLSPAAV 
QPQTQ\m,RC 
CAHSAFGQAC 
EDETPITNR8 
ElBDMPIiFBP 
IJffiSDAGVYT 
PTWKYRNC^ 
KLKFTPPPQP 
TBDnAGNYTC 
PI.rQWlQSXDlK 
WDKPVFEES 



TLGKSEFGEV 
liliGliCRSABP 

GDFSTKfiDVH 
QROm^SPKD 



31 
I 

AXVFIKQPB& 
DBIiQDSGTpQ 
HIDGHPRPIX 
SSt^TLBIA 
RPPEXfRSATV 
RVPTAGSESR 
CEAASILAGQR 
XiZSEDgRFEV 
QQCM&FDKEA 
ZASEIGPQGQX 
XU^PTKLGPR 
EGPGSPPPXK 
□OPLQlNaQFS 
FLAKAQ6LE8 
HYHVIiBWDfli 
LAASNGLVSA 
AFGVliMWEVF 
RPSFSEJASA 



41 
1 

QDALQQRRAL 
CVARDDVTGE 

PB^ABVVIA 
FANGSLLIiTQ 
VTCI.PPKGLP 
HQDVNITVAT 
FKKIGTIiRXNS 
TVPC8ATGRB 
RAHVQI/TVAV 
KHIFQKGSLV 
MJOTZGI-SVO 
AEIQSEIVAIjT 
GVAETLVLVK 
CEJLKOFIiMS 
QRQVKVSALG 

thgemphggq 

IGDSTVDSKP 



Nucleic Acid Accession #: NM_013952
Coding sequence: l£l..ia57



1 
I 

TTCAGAAQGA 
^xaAGGGCCTQ 
AGCCCGGAGC 
ATCIGQOCAT 
GGAAU'iUUTC 
CTCT080CAG 
GACTOaCAGC 



CCGAOACGCaG 
CMTi'AATAGA 
CGTGGCCACC 

CATC3GCTCAG 
ACTAAGCATT 
CTTCAGCCAG 
GOCCIA^CC 
CAftCAGCRCC 
CAACCICICG 



11
I 

GGAGAGACAC 
CAqcCGGCCG 



OBCCftGOGCA 
CICOGCGTCA 
ATCCX3GOCTG 
AAGATTGGQG 
CTCCXGGCTG 
ATCAXOCXSGA 
A3K3TCCCTGA 
OCCOU3TOGG 
OCTGGCAGGG 
QRCTCACAOA 
CACCA0CT06 
TCCC2CCAaoc 
CTGGA0GA06 
AiCTCACXTVGA 



CAGITCACGG GCCAGGCCCT 
TAOOCACOCC ACATOOCCAC 
A1GGTGGCAG GAAGTGAATA 
AQQQAQQCCX GQGbw4"j.'CX3C 
ACATGAAOGC GGAGXCCAOC 
TaOGGACAQT O 



21 
I 

CGGOCCCAGQ 
GCCAGGGCAG 
CXGC3GAGCX3A 
ACCIU3CTI3GG 
TCGTAGACCT 
GOCATGOCZG 
GAGTGATAGG 
ACTACRAACQ 
AOBGOGTCTG 
CCAAAfiTGCA 
GTOCX3GGACA 
ATTCCCTQOO 
ACAAlQAJSGAA 
GCAGCAQCAg 
A8COSCTGGZ1 
ACACCAAAGO 
GC5UV9GCCAC 
C3CTACCCCC3T 
GCCCTTCAAT 
CX^TCTCAGGG 
CAGOSGACAG 
CICTGGCAAT 
CAACTCCAGC 
GCCCftiXSUCX 



31 
I 

GCAOCCTCeC 
GOOCAGGOGC 
CIC0CXXX3GG 
AGGQGCXTTTT 
OQCCCAOCftS 
GOTO^CAAG 
GGGCTGCAAG 
CCAOAACCCT 

tgachaigac 

6CA7U3CATTC 
CA06CTGATC 
CTCCACCTAC 
AATGGA1Y3AG 
CGQAGCCOGA 

QQAGCAG^SGC 
GCTGAOCCCT 
GOTGCSCAOCT 
GGCTTTOOCC 
GSABAfiATGa 
GGCAGCTATG 
QCCEATOQCC 
TTGCXGAlSTT 
GCCAQGGCXrr 



41 



OGCCCGGACC: 
ATGCCTCIWCA 
OTGAATGGCA 
G6TGTAA86C 
ATGCTTGGCA 
CCCAAQGTGG 
ACCATCTTTQ 
ACTGXGCGCA 
AACCTOCCTA 
COCAGCTCAG 
TCCATCAATG 
AGTGATCA8G 
AAGCACCTXC 
GAGOGGCAGC 
CICTAOCC3GC 
TOCAACAGGC 



2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420

3480

3540
3600
3660
3720
3780
3840
3900
3960
4020
4080
4140



51 

1 

LRCEJVISAE'GP 
BARSAHAS:^ 
DGQSNEXVSS 
PQDVWTVHYE 
VRPRNAGIYR 
EPSVHWEaAG 
VPSnLKKPOD 
VEVTOGTWYJl 
KPTIKHHRAD 
FITPKVBPBR 
IHDVAPEDS6 
AAUAYIIAVI. 
SUaSOPAATK 
SLQTKDEQQQ 
KSKDEKUCGO 
LSKCVmSEY 
ADtlBVtiADLQ 



51 
I 

AGCCAAaCAG 
TACGGGAGGA 

ACicxsvxcavs 

GAOCTCTGCC 
CCTGCXaACAT 
GOTACTACQA 
CCACOCCCAA 
CCIGGXSAGAX 
GIXjTCAGCTC 
TOaACAGCTG 
CIGTAACIOC 



ATGCTGOCTC 



CCTOCTCTGC 
ACRDOCCXTTA 
CCCCATATTA 
TTGAOC&TCT 



AXAGCTGOGG 
QCAOQQATGC 
ACTAOCCAGA 
TGCCCTTGCT 
CACI06GG0S 
GGAICTGCAQ 
06TGXACGGG 
GCIGCCCGGA 
GATCGSCM3GC 
CTCCTCCTAC 
TTACRGTTCC 
GTAGTT6CCA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600
660
720
780
840
900
960 
1020 



60 
120 

180

240 
300 
360 
420 
480 
540
600 
660 
720 
780 
840 
900 
960 
1020 

1080

1140
1200
1260 
1320 
1380 
1440 






Seq ID NO: 536 Protein sequence
Protein Accession #: NP






ILGRYYETGS 
TVPSVSSIHR 
SIHGIJ/3IAQ 
ERQSYPSAYA 
PPPWICSKSA 



11 
I 

IRPGVIGGSK 
IIRTKVQQPF 
PGSDKRKMDD 
SPSHTK3EQC} 
PGSSPSMPFP 
EVNTIiAMPMA 



21 
I 

VNGRPLPEW 
PKVATPKWE 
NLPMD8CVAT 
SDQD&CRL&I 
IiYPLPLTJffST 
MliPPCXGSSR 
TPPTPPTARP 



31 
1 

HQRIVDUUIQ 
KIGDYKRQNP 
KSIiSPGHTLl 



IJ>DQKA.':rLTP 
ARPSSQ6ERH 
OASPTPAC 



41
] 

GVRPCDISRQ 
TMFAWBXRDR 
PS8AVTPPES 
KHLRTDAFSQ 
8HTPLGRNI.S 
NGPXCFDTHP 



51
I 

UCVSHOCVSK 
I'LABSVCDHD 

HHLEPLECPF 
THQTYPWAA 
TSPPADRAAM 



Seq ID NO: 537 DNA sequence
Nucleic Acid Accession #: NM_003466.1
Coding sequence: 11..1363



QAATTCGGGS 
AGGGGCCTTT 
GOCCCACXaa 
C»TCnGCAA6 
GGGCXCCAAa 
CCAGAACCCr 
TGIUCAATGAC 



CAOGCTGATC 
CrCCROCTAC 
AATQGATGAC 
G06ACCC0GA 
GTOOCCATTT 

CCTGAOOCCI 
OGTGC3CAGAT 
TAGCTCCACC 
OGOGOTCXTQ 
CCTbGtjOCXTC 
CATCCSSCaCC 
AAGTGAATAC 
GOGCTTCCCC 
OAGTGCA006 



11 

1 

ATQCCTCACA 
GTGAATGGCA 
GQTGTAAG6C 
ATCCTTQGCA 

cxx:aaggtqq 

AOCATGTTT6 

AcnstiecciCA 

AAGCICCCTA 
CCCAGCTCAS 
TCCATCSU^TG 
A6TGATCAGG 
AA(5CA£3CITC 
GAfiCGQCAfiC 

7CCAACA0GC 
DCTCACTCAC 
OCTTOCTGTT 
CCCTTCAATB 
CTCTCAGGGC 
AQC(5QACAOG 
TCTGGCAAT6 
AACTCC3U3CT 
COrAGCACTG 



21 

I 

ACTCCATCAG 
GACCTCPGCSC 

ocraoaAcar 

G6TACTACGA 
CCACXX:OCftA 
CCT6GGAGAT 

<3a:GTCAQC!rc 

TGQACAGCTG 
CTGTAACTCC 

ATAGCTGCOG 

ACTACCCAiGA 

CACTGG6GC6 
CCTTCGC3CAT 
TATCTAOCTC 
CCTTTCCCGA. 
GAGAGATGGT 
GC3U3CTATOC 
OdAIGGCCA 
TGCTQA6TTC 
GCACGGCCTT 



1 

ATCIQGCCAT 
GGAASTGSTC 
CTCTCQCCAQ 
GACTGGCAGC 

COGAOAOOGO 
CAOTAATAGA 
CX3TQQCCACC 
CCOGGAGTCA 
CSTOGCTCAS 
ACTAAGCATT 
CrrCAGCCAG 
ggoctatgcc 
CAACA£3CAjCC 
CAAOCTCTCX3 
AAAGC2U3(SAA 
0GCCTTTTT6 
TOCTSOCrOC 
GGQ6CGCAOG 
CTCCTCTGCC 
CACOCCCTAC 
GOCATATTAT 
TCACCATCTG 



41 

I 

GQAGOGCTOA 
OGCCAGCGCA 
CrCOGCGTCA 
ATCOGGCCTQ 
AAGATTGGGG 
CTCCTGGCTQ 
ATCATCGGGA 
AAGTCCCTOA 
OCCCAGTOGG 
OCXGGCAGCG 
GACTCACAQA 
CACCACCTOG 
TCOCOCAGCC 
CTGGACGAC6 
ACTCACCAGA 
ACCCG0GIU3G 
GATCTaC3W3C 
GTGTAOGGGC 
CTGOCXTGGftT 
ArrCGCAGGGA 
TCCTCCTACA 
TACAGXTCCA 
TA6TTGAAGC 



51 
1 

ACCAGCTGGG 
TCGTAGACCT 
GCCATGGTTG 
GAOTOATAGG 
ACTACAAA06 
AGGGCGTC^rQ 
OCAAAGTGCA 
GTCGOGGACA 
ATTCCCTGGG 
ACAAGAGGT^ 
GCAGCAQCAO 
AGCCGCTOGA 
ACAOC^AAGG 
GGAAGGCCAC 
CCTAOCXS33T 
TGTCCAGITC 
AAGTCGOCTC 
AGTTCACGGG 
ACCCACOCCA 
TGGTGGCAGG 
GOSAGGCCTG 
CSVrCAAGGOC 
TT 



Seq ID NO: 538 Protein sequence
Protein Accession #: NP_003457



1 11 21 

| | |

MPHHfiIS£GH GOLKQLGGAF VHGRPLPEW 
ILGRYYETGS IRPGVTGGSK PKVATPKWB 
TVPSVGSIHa IIRTKVQQIFF HIiFllDgCVAT 
SlKEGIiIdOIAQ PGSDKRKM[3D SDQDSCRL8I 
ERQ3YPEAYA SPCHTKGBaa IiYFLPIiIiNST tlODGH^AIiVHB 
FH9PFA1KQB TPEVSSSSST PSSLSSSAFI* I3LQQ|VG5GVP 
LSGREHySPT LPGypPBlPT &GQGS3fASSA lAGHVAGBBY 
SfSSLI^FYy YSSTSRF9AP PTTATABDRb 



31 
I 

BiQRTVDliAHO 
KIGDYKRQNP 
KSIiSPGHTIiI 



41 
I 

GVRPCDIGRQ 
THFAWBIBDR 
PS&AWPPES 
KHU^TOAPSQ 
StTTPLGRNLS 
PFKAFPHAA3 
S6IiIAY(aiTPY 



51 
I 

LKVfiHGCV&K 
ULtAEBTCnHD 
SQBDBLGSry 
HHIiBPIiBOPF 
THQTYPWAD 
VYQQFTOOAL 
SSYSEAHRFP 



120 
180 
240 
300 
360 



60 
120 
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260
1320 



60 
120 
180 
240 
300 
360
420 



Seq ID NO: 539 DNA sequence
Nucleic Acid Accession #: NM_006799
Coding sequence: 19..963






1 
I 

GOOSOGGCSAG 



tggcagqgqa 

CGCTGGGCAjC 
GGOTOQAXGG 

TACiAcannr 

CCCCATGACA 
CGCATCIGTC 
GaCTGGGGGT 
CAGGTCX30CA 
AAGOACAICr 
TTGSGrauCr 
GTCQTGftGCT 
AGCCACCACT 



TGAGCCOUCC 
TQTCTTQTTT 



11 
1 

AGGAGGCCAO: 
TCM36AAGOC 
TCACGICGOS 
GCCTGCGCCT 
TCACGGGGGC 
TCCA6TTTGG 
GTTACTTOQT 
TTGGCTTGCSr 
^CCaOGCCTC 
ACATCAAAlGA 
TCATAAACAA 
TrOGAGACAT 



21 

I 

GGGOGG60GC 
OGAOTOGCAO 
CATGGTGGGX 
GTGGGWTCC 
GC3iCTGCTTT 
CCASCTGACT 
ATCmATATC 
GAAGCICTCI 
CACATTTGAG 
GGATGAiGGCA 
CTCTATQTGC 
GGTTTGTGCT 



GGGO^GuTOGG CTGTGCTCGG 
TTGAGrGGAT CCAGAAGCTG 
CG^EACTCTT TTTCOCTCTT 
TGAGOCCATG CAGCCTGOGQ 
GQTAAZAAAC ACATTCXZAiCT 



31 
I 

GGGGCGCIGG 
GAOGOGOGQC 
G6AGA6GACG 
CAOJTATGQG 
GAAACCTATA 
TCCATGCCAT 
TATCEGAGCC 
GCACCTGTCA 
TTTGAGAACX! 
CTGCCATCTC 
AACXZACCTCT 
GGCAATGCCC 

aacaagaatg 
cocaatgggc 
atggcgcaga 

ccacxgccaa 
tgatgccttg 



41 

I 

tgctgqogct 

GQXXftTC3UaQ 



GAGTGAGCCr 
GTGAGCTTAG 
CCTTCTGGAG 
CTG6CXACCX 
GCTACACXAA 
GGACAGACTG 

cocACAcs:cr 

TCCTCAAGTA 

GACTGTGGTA 
OGGQIGTCTA 
GTGGCATGTC 
TCCXZRCTCCT 
GTCAGGOCCT 
Ca^GGGCATTC 



51 

GCTGCIGGCT 
ACCA7!GGGGC 
OGGTTGGCOG 

tgatoccicc 
CCTGCAGGCC 
GGGGAATTCA 
ACACATCCAG 
CTGGGTGACT 
GGAGGAAUiTi' 
CAGTTTOGGC 
QGATQCCTGC 
TCAGATTGGA 
CAOCAATATC 
OCAGCCAGAC 

GGTTCTCTTC 
TTCAAAA 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020



Seq ID NO: 540 Protein sequence
Protein Accession #: NP_006790






1 11 21 31 41 51

| | | | | |

HGARGAIrLIA IiIiLARAOIjIUC PESQEAAPI^ GPCGRKVITS KIVGGEaDAEIi QRHFWQGSIrR BO 

LHDSHVQGVS LLSHRWAI.TA AHGF£T?6DI» 8DP&6HMVQF GQI<T&KPSF» SUOfiXYTRYP 120 

VSHZYIiSFRY liGNSPYDIAb VKLSAPVTYT KHIQPICLQA STFEFENRTD CWVTGWQXIK 180 

EDEALPSPHT liQEVQVAIilJ NSMCHHLFtiSC YSFRKDIPGD MVC3WQKAQGG KHACPGDSGQ 240 

PIACHKNGLH YQIGWSHGV GOGRSNRFGV YTHISHHFEW IQia:.M2VQSGM SQPOPS«SLIj 300 
FFPl£HALPb LGPV 

Seq ID NO: 541 DNA sequence
Nucleic Acid Accession #: NM_014344
Coding sequence: 131..1444

1 11 21 31 41 51 

| | | | | |

tsaSOCCGCGA TGOGQCCOAA GCQCCCHAAO CCCCBGAGCC CACAAACTGC CGGGOCOGCC 6Q 

TCGCOGCCGG GACCXSGGTG CCTGGGCTCG GCTTGAAGCX3 GCCGCGOCQC ACCOGCauzaG 120 

CCGOOGQAGC ATOGQCAaOA QQATGCGGGG CGeOGCCGCC ACCXSCQOGGC TCTGGCTGCT 190 

GGCGCTGGGC TCGCTGCTGG CGCXGTGGGG AGGGCTCCTG CCGCCGCX3GA CCQAgCTOCC 240 

C9CX:TCC09(3 CCGCCC6AAO ACOOACTOCC ACGOGGCCCQ GCTOQGAQCG 60GGCCCGGC 300 

fiCCOGCGCCr CGCTTCCCXC TGGCOCGGCC CCTGGCGTGG GJVOSOCCGCXJ GCHOCTCCCrr 360 

GAAAACTTTC OBOGCaCTOC TCACCCTGGC QQCCGQCGCG fiRCXSGraCCGC CCCGGCRGTC 420 

CCOGAGOGAG CCCAGGTGGC AOGTGTCAGC CAOGCftGCCC CC3GC03QAGG AQAOOGCOQC 480 

QGTGCACGGG (SOCGTCTTCT OOAGCOQCGe CGTGGAGGAG CAGGTGCOOC CGGGCITTTC 540 

OGAGGCCXIAG GOGGCGGOQT GGCTGGAGGC GGCTQQCQGC aCCCQOATGO TGGCOCOXSGA 600 

GOGOGOOGGr TGCGGGCQCA GCTCCMiCCG ACTGGOCXX3T TCTGCGGAOG GCACCCGC3QC 6^0 

CTQCOraCQC TACGGCATCA ACCCGQAGCA OATTCAGOQC GRGGCCCIOT GTTACTATCT 720 

GGOSOGCCT6 CTGGGCCTCC AGOGCCACGT GC0GCX3QCITG GCACTGGCTC GG^TiaaAQGC 780 

TOQOQGCQcg CAOTtSOGOK: AGGTOCASQA QOASCTGCSGC GCXGOQCACT GGAOCSAGG6 640 

CAGOGTGGTG AGCCTGACAC GCTGGCTGOC CAACCTCAOG GACSSTOgroa TaccOQC36CC 900 

CTGGCaCrrcO OAGGACOGCC OTCTGCQCCC CXITCGGGGAI: GCOSGGGGTG AGCTGGCCAA 9€0 

CCTCAGCCAG GGGGAGCTGG OXSGACCTAGT ACAATGGAOC GACTTAATOC TTXTOSACTA 1020 

0CTGAO3GCC TU^CTTOtJACC GGCTOaTAAa CAACCTCTTC ASCCTGC2«3T GGGACCOGOG 1080 

CCTCATGCaG OGIGCCACCA GCAAOCTGCA OCGOGGTCCG GGCGGGGOQC TGOTCTTTCT 1140 

CJGACAATOAa QCOQGCTTGG TGCAOGGCTA CCaOGrAeCA GGCATGTC3GG ACAAGTATAA 1200 

CGAGCOGCTG TTGCAGTCAG TGTGOGTGTT COGOGAGCGG ACCGCGCGGC GOOTOCTOGA 1260 

GCrOCACOGC GGACAGGACG Ca3C33acC03 QCIQCTGCGC CICtACQQGC GCCAOGftGCC 1320 

TGQCTTCCCX: GAfiCTGGCOG CGCTTGCAGA CCCOCACGCT CAGCTGCTAC AQOGCCHCCT 1380 

OQACTTCCTC GaC3«GCAC3V TTTTGCACrrG TAAGGOCAAG TACGGCOGCC GGTCTGGGRC 1440 

TTAGTGICAC OGGGAGGAAA AGAGAGAGAT CIOGGGCTG6 GGTATGQATG ATGGGGQOAA 1500 

GGOCSSTCQC CTCTGOCACT GTCTVOGOACC AGCXDGGCCAA CGCCCACOCG CSVAAGGTGXC 1560 

TAAAAACTTC AGCTTTTCAC CCACCTGCCC CTTTCTTTCA ATCCCAOGCT GTTTOCTTTC 1«20 

AAAGTTCTGG QRGOACXSAAC TCACCaAOGC GAGAAQTQTA ACaTTCTCTC CACCCASCTT 1680 

ATAAA&GGAT TClTTACTGT GCXZAGCACGG QGATTGGATC CGAAGAAACT GGCTACT6GG 1740 

GTTTGGCCCC OGAOTGGCCG TCCCTGTGGO AGATGCACCC CATTCTTGOG CCCCCCICAT 1800 

TCCCTTTOCSS AAAAAGGAAA ACTIGOSXTT GAGOCGTTGA GCTAATTCTG CAATTTTCTA 1860 

CC&AACAQAO CX3CTGGTG3C CXXX3GAGCAG GGCTGTGACA TTGGCTGGXa GfifiCXXCTTC 1920 

CTGTGTTCTC OCTTTGTTCC AGOGCX2GOGA TGGTGAGATC ACTGTTCCAA GCAGGGGOAC 1980 

GGCTCGCSGAT AGQACAAAGA GAGCAQGACC TCCAQACTCr GGGGAGCCCT GCAGACCTTG 2040 

AGAATTTGCC TGACTCATTC CTG7W0CICTT GTCATTTTGG CCTGAAGGCT ACAAATTCAS 2100 

GGTCRfiCTQT ATGCACTAAQ TCAAATAATG AATTTCTfCC TOCCTCTCGC AAOCGACCAA 2160 

AATTTTGACA AOGATGATGT TCACCAGAA6 GAAAAAAAAA TCAGTTTTAT GCACTTTATT 2220 

ITGmMtlAT TTTC2\3TTTT TATTAAGAAA AAAl^TATT T^TACAGAATT TAOCITCTCT 2280 

GTATATATGT GCATAAAGTG TGGTGTAAAT ATACTAAACA AACTTAlfAXT I(3UVI%AA2US 2340 
G6AG*rTTAAA ATTTAAAAAA AAAAAAA 

Seq ID NO: 542 Protein sequence
Protein Accession #: NP_055159



1 11 21 31 41 51

| | | | | |

MQBRKRiGAAA TAaLWiZJUja SLIiALHGGI>L PPR-TELPASR PPEDRIfKRP AR8GGPAPAP GO 

RFFZiPPPLAW DARGGBLiKTF RALLTXAA0A PGPPRQSBSB VBSmWOtQ^ XPBBGAAVHG 120 

GVFnfiSGLEB QVPPGFfi&AQ AAAHLBAAEUS ARMVAIiERGG OSRSSNRIAR FADGTBACVR IBD 

YQIHPEQIQG EALSYYXiAHL LGLQRHVPFI. ALARVEaSfiA OfWAQVQEBLH AAHHTBGfiW 240 

SLTRWLPNLT EfVWPAPWRS BDGRLRPIiBD AGGEIiANLSQ AELVDLVQWT DLII.FOTJ»TA 300 

NFDRlATSigiiF 9I^Q«DPRVHQ RATSMUIROP GOaiaVFLtniB AGliVHQyilVA GMNDKOIBPIi 360 

LQ6VGVFBS& TABRVZ^LKR QQDAAARIilA UCSBHEPSFP BLAALADPHA QLLQRRLDFIi 420 
MXaxmCKKK TGRRBGT 



Seq ID NO: 543 DNA sequence
Nucleic Acid Accession #: XM_007652.4
Coding sequence: 1..1290

1 11 21 31 41 51 

| | | | | |

ATGGCOGGC7 CIQOCGCGTG GAAGC60CIC AAATCTATGC TMSOAAQGA TQATGGGCC3G 60 
CTGTTTITAA ATGACAGCAG CXaCCTTTOAC TTCTCGGATG AG60SGGGGA GGAGGGGCTT 120 
tECTCGGTTCA AQUUkCXTCQ AGTTGTGGT6 GCG3ATGAC6 GSVCCXSiMXZ CCCGOAAAQG 180 
CCTGTTAACG GGGCGCACCC GAOCCTCCBQ QCCGAOGATO ATPCCTTACT GGACCAAGAC 240 
TTACCTTTGA CCAACAGTCA 6CTGAGTTTG AAGGTQQACT CCTGTGACSIA CTGC3U5(2lVAA 300 
CAGAGACAGA TACTGAAGCA GAGAAAGOTG AAfiGGCAGGT TOACCATTGC TGCGGTTCTG 360 
TACTTOCTTT TCATGATTGG AGAACTT6TA G6TGSATACA TTGCAAATAS OCTAGCAATC 420 




IITGTGGCTAT 
GTTTTGTCS^G 
GAAGCTQTGC 
ACCGCAGCI6 

GAJVC&TAAGC 
OA.TTTGGTAC 

TTTCGAATCA 
GTAGAjCTATA 
AATATCTGCT 
GC3JUU3TTCAT 
TTTGGCATGT 
TGaXSCAAATT 



CftCTTCATAT 
CATCAAAATC 
CTATGRTTAG 
AAAGAACTAT 
TTG6A6IT<3C 
CCCATTC3GCA 
ATGG5CA66A 
AGAGTGTTGG 
CTGATCCCAT 
TATGGGATAC 
TCAAAQAAGC 
CTCTCACTTC 
CTAAATOGGA 
ATAGAT6TAC 
GTCnOAOTTC 



GTIAACTGAC 
ACCAACCAAA 
TGTGCTGTTa 
CCATATGAAC 
A6TTAATGTA 



TAGCCTGGCA 
TGTQCTAATA 
CTGTACATAC 
AC3TAOTTATA 
CTTGATGAAA 
AGGAAAATCr 
GGAAGTACAG 
TATTCAGCTT 
TAGTCCCTGA 



CTAAGOSCCA 
AGATTCACCT 
GTGTATATAC 
TATGAAATAA 
ATAATGGOGr 
TCAAATTCCC 
aTQAGAGCTG 
GerGCATACA 
GTATTTTCAT 
ATACTAGAAG 
ATAGAASATG 
ACTGCClkTAe 
TCCftAAGCAA 
CAGAQTTACA 



TCATACTCftC 
TTGGATTTCA 
TTATGGGATT 
ATGOAQATAT 
TTCTGTTGAiV 
CZTAOdUSAlGG 
CATTTGTACA 
TCATACGATT 
TACITGTSSC 
6TGTGCCAA3 
TATATTCIUaT 
TTCACATACA 
ACCATTTATT 
GGChAGAAGT 



ccrGcmacr 

TOQCTTAGAG 

ccrcrrrATAT 

AATGCTCATC 
CO^GTCIGGT 
TTCTGGGTGT 



CAAGCCAGAA 
TTTTACAACA 
CCATTTOAAT 
CGftAGATTTA 
GCXAATTCCT 
ATCGAACACA 
GGftCAQftACT 



480

540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260



Seq ID NO: 544 Protein sequence
Protein Accession #: XP_007652.1
1 11

| |
MA68GAWKRL KSM&RKDDAP 
PVNOaHPTLQ ADD(D£liLDQD 

VLSAMISVUb VYIIMCSFLLY 



YKIADPICTY VFSLLVAFTT 
HXHSirrSGKS TAIVHXQI.IP 
CIUIOQ88SP 



21 
I 

IiPLTNSQLSD 
KTDAI^ID^TD 

EBNHGQDSLA 
FRIIWDTWI 



31 

i 



KVD8C33NCSK 
I/aAlXtilLIiA 
YKINGDIMLl 
VRAATVRALG 

8K2U9KLLLNT 



41 
I 

QBEILIQQIRKV 
LHZ>6SK8E>TK 
TAAVl?VAVNV 
SLVQBVGVEiX 
-VBYIICBUMK 
FGHYRCTIQZi 



51 
1 

ADDGSKAPZR 
KKRLTIAAVL 
RFTFGFERIiS 
IHSFIJjNQSG 
AAYllRFKFE 
lEDVYSVEDIi 



11 
1 



21 
I 



31 
1 



51 



60 
120 
180
240 
300 
360 
420 



Seq ID NO: 545 DNA sequence
Nucleic Acid Accession #: AB037765.1
Coding sequence: 1..2478

1 11 21 31 41 51 

| | | | | |

AT6TTTTCOG GCTTCAATGT CTTTAGAGTT (iOQATCTCTT ^^IGTCATJUVT GTGCATTrTT 
TACATOCCAA CAGTAAACTC TTT^ACCAGAA CTGAGTCCTC agaraiattt taGtacattg 
CAAOCAOGAA AAGCCTCTTT AGCTTATTXT TQTCAAGCTG ATTCCCCAAG AACATCTOTA 
TTTCITORAa AACTGAATGA GGClGa:TAGA CCICTGCZUSG ACTATGOAAT TTCA6TXGGC 
AAG8TXAATX GXGTCaAaGA ABAAATATCA AQATSaUZTBlG GAAAAG&AAX tiUATl'llWrO 
AAAGCATATT TXTTCAAGGG CAACIVTATTG CTCZ^GAGAAT TCCCIACTGA CACCTTGTTT 
GATGTGAATG OCATT6TCGC OCATGTTCTC TTTQCECTTC TTTTTAGTGA ASTGAAATAT 
ATTACCAACC TGORAQACCT TCavGAACATA <3A3^TGCTC TGftAAGGAAA AtSCRAATATT 
ATATTCTCAl' A'XGTAAGAGC CATTOGAATA OCAGAGCAC». GAOCAOTCAtT GGAAGC08CT 
TTTGTGTAIG GGACTACSVTA OCAATTTGTC TTAftCCACS^ AAATTGOGCI TCTGOAAAfiT 
AXTGGCTCTG AOGATGTGGA ATATGCACAT CTCTACTTTT TTCATTGTAA ACTACnCTTS 
GATITGACCC AGCAATGTAG AAGAAGACTA ATGGAACAOC CATTGACTAC ACTGAA£!ATT 
CACCTGTTTA TTAAGACAAT GAAAGCACCT CTQTTGACTG AAGTTGCIGA AGATCCTCAA 
C3UW3TTTCAA CTQTCCATCT CCAACIGGGC IXACCACTGG TTTTTASTOT TAaGC3\AlCAG 
GCTftCTTATG AAGCIGATAlS AAGAACK3CA GAAT6GGTTG CTT GGQBTC T TCTGGGflAAA 
GC2U3GMaTTC TACTCITGTT AAGGGAGICT TTGC3AAGTGA ACATitJCTCA At^TQCXAAT 
<?ZCGTCTTCA AAA6AGCAGA A9AGGGA0TT COVdTGGAAT T^CTXGGTATT AGATGATQTT 
GATTXAATAA TATCTCAIGT GGAAAATAAT ATGCACATTG AGGAAATACA AGAAGATGAA 
OACAASCGACA TGGftAOOTCC AGATA1AQAT GT3!CAGGArrS ATGAAGTGGC AGAAACIGTT 
TTC^ GftGaSPA GGMUSAGMUi ATTAGCTTT6 QAACFXACnG TOGAACXAAC AGAAGSVAACA 
7TTAATGCAA CAGTGATOGC TTCTGACRaC ATAGTACICT TCXKIGCTG6 TTGGCAAGCA 
GTATCJCAOrGO CATPTTTTGCA ATCCTATATT GAT3TGGCAO TTAAACTQAA AGSCACATCT 
ACTATGCTTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG ATGTAIGTAC TAAGCAAAAT 
GITACTGAAT TTCCTATCAT AAAGATQTAC AAGAAAGGOG AGAACGCAGT ATCITATGCT 
GGAATQTXAB GAACCGAAGA. ^CXCCZAAAA TTTATCCRGC TCAACTlGSAT TTCRTKSCC^ 
GTGAATATAA CATOSATCCA AGAAGCSUQAA GAATATTIAA GIGGGGAATT ATATAAAGAC 
CrCATCTTGT ATTCTAGTGT GTCAGTATT6 GGACTATTTA GTCCAACCAT GZUlAACAGCA 
AAAGAAGATT TTASTGAAOC AGOAAACTAC CTAAAT^GGAT ATGTTATCAC TGOAATTTAT 
TCIGAAGAAG ATGTTTT6CT ACTGTCAACC AAATATGCTG CAASTCTTCC JUSCCCXGCTG 
CTTOOCAOAC ACACavSAAaa CMOMEKONS A8CATOCCAC XAGCIAeCAC ACATGC3U31A 
6ACATAGTTC AAATAATAAC AGATOCACTA CTGGAAATGT TTCCBGAAATT CACTGTGGAA 
AATCTTCOGA GTTATTTCAG ACTTCftGAAA CCAITATTGA TTTTGTTCAG TGATGQCACT 
^AAATCCTEC AGTATAAAAA AGCAA7ATTG ACACTGGTAA A6CAGAAATA CTIGGATICA 
TTTACTCCA7 GCIGGTTAAA TCTAAAGAAT ACTOC3U33GG GGAGAOGAAT CTTGAOGBCA 
XATTTTGATC CTCXGCX^ZCC CCTTGCTCIT CIIGTTXTGG T6AATCTOCA TXCAGGXGGG 
CAAGTATTTQ CAVTTOCTTC AGACCAGGCT ATAATTGAAO AAAACCTTGT ATTGTGGCTG 
AAGAAATTAG AA8CAGGACT AGAAAATCAT ATCACAATTT TACCTQCTCA AGAATGGAAA 
CCrCGTCTTC CAOCTTATGA TTTTCTAAGT ATGATAGATG CCGCAACATC TCAAOGTGGC 
ACTAQQAAA6 TTCCCAIUSTG ZATQAAAGAA AOUSATGTGC AGGAGAATGA TAAGGAACAA 
CATGAAGATA AATCGGCAGT CAGAAAAGAA OCGATTGAAA CTCXGAGAAT AAAGCATTGG 
AATASAAGTA AXTGGXTTAA AGAAQCAGAA AAATCATTTA GA03TQATAA AOAGTXAGGA 
TGCTCAAAAO TQAACTAA 

Seq ID NO: 546 Protein sequence
Protein Accession #: BAA92582.1



60 
MF5GFMVFRV GISFVIMCXF YMPTV39ISLPE I.8PQK:yFSTI. QFGKASLATF CQAD3PRTSV 60

FLBEEiHEAVR PLQDYGISVA. IIYCGKSKQI.M JCAYIiFKGEflZ* LRBP&TDTI*F 120 

DVNAIVAHVI. FALLFSEVKff ITMliSDdiQm EKALKSKMlI IFSYVRAIGX PEHRAVHEAA 180 

FVYGTTYQFV X,TTEIAIJ-5S IGSEDVBYAH BYPFHCKLVL DLTQQCRRTI. HBQPI*TTIilII 240 

HltFIKTMKAP LLTEVABDPQ QVSTVHLQIjG IjPI>VFIVSQQ ATYEADRRTA BHVAHRLLGK 300 

AGVliLIiXJmS LEVNZPQDABI WFKRAEB6V PV&PI.VUIDV TSLTJBBVBSEt HKI^ZQEDB 3£0 

DHDKEGPDIS VQOO&mBTV EKDSJCRKUPIt SbTVELTEBT fU&TVNASDS IVXiFYAGWQA. 420 

VSHAFIiQSYI DVAVKLKGTS rTMLLTRIMCA DHSDVCTRQN VTEFPIIFMY KKGKNPVBYA 480 

Gt4LGT£DIiIJC FIQLNRISYP VHITSIQEKE EYL9GELYKD lilliYSSVSVl) GLFSPTMKTA 540 

KEDFSBA6NY LKGYVIXGIY KYAASIiPMiL LARHTE6KIB SIPIASTHftQ €00 

DIVQiXTDAZi ISHFPBZTVB NIiFSYFRLQK PLtilUFSDGT VHPOYKKAIZ. TLVHaKYUlS 6fiD 

FTPCHLHLKH TPVGRGIIiRA YFDPI.PPLFL. lATLVKLHSGG QVFAPPSDQA. I ISESIL VLWI/ 720 

RKLEASLBIH ITItf AOBG9K PPIiPAYDFLS HIDAATSQSG TRKVPKCHKE TDVQEBDKBQ 7Q0 
BEDXSnVRKE P1ETLRXKH19 XIRSNKFICEA& KSFRBDKBL6 CGKVH 

Seq ID NO: 547 DNA sequence
Nucleic Acid Accession #: NM_033102.1
Coding sequence: 1..1662

1 11 21 31 41 51 

| | | | | |

ATG6TCCZU3A. OGCT6TGGGZ 6ASCCGCCTG ClGOSGCACC GGAAAQOCCA GCTCITGCTG 60 

GTCAACCTGC TAACCTTTGG CCTQGAGGTG TOTTTQC3COG CW3GCft,TCAC CTATGTOCCQ 120 

CCTCTGCTGC TGGAAGTGGG GGTAGAGG2U3 AAGTTCATGA OCA.TGGTGCT GGGCATTQGT ISO 

OOUSTGCTQG faCCTaOTCTQ TGTCCOQCTC CTAGQCTC3b3 CCASIGACCH daSCQTOGA 240 

GGCTKIGGCSC GCOGCGGGCC CTTCATCTGS GCACTGTCXIT TOGGCATCCT 6CIGAGGCTC 3Q0 

TTTCTCATCC CAAGGGCOOS CTGGCTAOCA GoaCTOCTaT GCOCGQATCSC CaUSGCCCClG 360 

OAGCTOGCAC TGCTCATCCT GGGCGTGGGG CTGCTGGACT TCTGTGGCCA GOTGTGCTTC 420 

ACTCCACTGG AGGOCCTGCX CTCTGACCTC TTCCS66ACC CG6ACCACX6 TCGCXSVQGOC 480 

TACCCXOTCT ATQCCTTCAT QATCAQTCTT GGGOGCTGCC TGGGCTAOCT CCTGCCTGCC 540 

ATTGftCTGGG ACAGCM3TGC OCXGGCXXXX: TACCXG06CA CCCAGGAGGA GTGOCTCrET 600 
GGCCTGCrCA CCCTCATCTT CCTCACXTTGC (^TAGCAGCCA CACTGCTGGT GGCTtSASGAia fifiO 

GCAGCGCTGG GCCCCACCGA GOCRGCAGRA GGGC^GTCXSe CCCOCTCCIT GTC3GOCCCAC 720 

TQCTQTCCAT QC03C3GCOC3G CITGGCTXTC OGGAACCTOG GOGCCCTGCT TCCCCGGCTG 780 
CAOCAGCTST GCTGOCXSCAT GGOOCGCACC CTGGGCCGGC TClOrGGrOGC lIGAGCrGTGC B40 
ASCTGGAiaG CACTCATGAC CTTCACQCTG TTTXACACQG AT7T0STG66 OGnGGGQCTG 900 
TAlOCAGGGOG TGCGCAGAGC TGAGCOGGGC AOOSAGGGOC GGAGAGACIA TGAXGAAGGG 960 

OTTOSGATGG GCSMaCCTCGG GCTGTTOCXX3 CAGTGCGC?CA TCTOCCTGQT CTTCTCTCTO 1020 

GTCATGGACC GGCTGGTGCA GOSATTCQGC ACTCGAGCAG rcXATTTGGC CftSTGlGGCA 1080 

GCTTTCCCT6 TQGCTQCOSG TGOCACATGC CTGTCCCACA GTGTGGCCGT GOTGACaGCT 1X40 

Tcaaocjgc cc TChcCGGorsr Oiccrscia^ Gccacaaa^ tc cjg oqcta cacactggoc 1200 

TCX3CTCTACC ACOGGGAGAA GCAG6TSTTC CTGCCCAAAT ACOGAGGGGA CACTGGAGGT 1360 

GCTWSCRSXG AGGACAGCCT GATGACCAGC TTCCTGCxavG GCCCTAAGOC TGGAGCTCCC 1320 

TTCCXrTAATG GACACGTGGG TGCTGGAGGC AGTGGCCTGC TCOCADCTCC ACCXX3CGCTC 1380 

TGOSGGGCCT CTGCCTQTQA TOTCTCOOrA C3GTQTGG7GG TGGQXaAGCC CACGGAOGCC 1440 

AGGGTOGrrC CGGGCGGGG6 CATCTGGCX6 S^OCTGGCCA TCCT6GM»G TGCCITCCTG ISOO 

CTCTCCCAGG TGGOCCCATC CXZIGTTTKFG QBCrcaOTQ TOCAGCTCAG CQ^GTCTGZC 1560 

ACZQGCTATA TGGTGTCTGC CQCAGGCCXG OGTCTGGTGG CCATTTACIT TGCTACAC2US 1620 
GTAGTATTTG ACAAfiAGOGft CTTGGCC3AA TACTCAGCGX GA 

Seq ID NO: 548 Protein sequence
Protein Accession #: NP_149093.1



1 11 21 31 41 51

| | | | | |

MVQRLBfVSCtli LKHRKAQZiLL VMUAXOLEV CIAAOITYVP PUiISVaVEB KFHTKVLGIG 60 

PVl,GCVCVPEi XjGSASDHWRG RYGRRRPPXH ALGLGZLX18L FLIPSAGHLA GLLCPDPKPXi 120 

ELAZiLILGVG LISFGGQVCF TPI.EAU>SDE> FJQOPDHCRQA YSVYAFKtSI* OGCtiQyi^PA IBO 

IDHDTSAIiAp YI^GXQEBdiF GLLTIiXFLTC VAATIJ.VAEB AALGPTEFAB GL5AFSI»SPH 240 

OCPCRARLAF SIILGM.I.PSI1 BQLOCSiMPRT IiRRI#FVAELC fiHMAUSTFTI. FYTDFVGBGL 300 

VQGVeSABPG TBABBKXSBa VIWGSZ*G&FIi QCAISLVFSb VHORIiVG^EFG TSAVYLASVA 360 

AFFVAAGWrC LSaSVAWTA 8AA2>TGrrF8 AliQXXiCVTLA &I.YHfiBHQFtfF WSXttGEftOS 420 

ASSBDSIMTS FLPGPKPGaP FPNOHVaAGG SCJUUPPPPAIi GaASACX3fVSV RVWGEPTEA 480 

EWPGROICIi PLAIEiDSAFIi LSQVAPSLEH G8IVQL8Q6V TAXHVfEAAGIr GLVRZZFATQ 540 
WniKSIILAK VGA 



Seq ID NO: 549 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1389

1 11 21 31 41 51 

| | | | | |

ATQGGCTACC AQAGGCAGGA GCCTGTCATC C060CGCAGA GAGATTTAGA TGACAGAGAA 60 

ACOCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCIGC TCIITTTAAT 120 

GTTBTCAACr CQATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT ISO 

GGGnrTOCIT TOGGMUCAUX GCITTTAXTC HGGGTrTCAT ATGttACXXSA CimCCCIX 240 

CTTTTATTGA TAAAAGGAOB GGOOCTCTCT GGAACAGATA CCTACCAGTC TTIGGTCAAT 300 

AAAACTTTOG GCTTTOCAGG GXATCTGCTC CTCTCTGTTC TTCRGTrTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGKTT GGAGATACTT TGAGCAAAGT TITTCAAAGA 420 

ATGCCAGGAG TTGATCCTGA AAAOGTGTTT ATTGGTOSCC ACTTCATTAT TGGACTTTOC 4 BO 

ACAGlTAOCr TCACXCIGOC TTTATCCT3G TAOC6AAATA TAGCAAAOCT TGQAAAGGTC 540 

TCOCTCATCr CXACaGGTTT AACAACTCIG ATTCTTGSAA TTGCAATGGC AAGGGCAATT 600 

TCACIGGGTC CACACATAOC AAAAAC3U3AA GAOSCITGGG TATTtGCAAA QCOCSASGCC 660 

ATTCA2U80aO TQGGGGTTAT GTCITTTGCA TTXATTTGCC AOCATAACTC CTTCTTAGTT 720 



TA£3VGTTCrC 
GTGATTTCTG 

AGATXTTGTT 
GAGQTAA.TTG 
ACAGTGATGG 
GTTCTAQAAC 
TGTTATCTGA 
ATGCTTCCCA 

AATACCTCM3 
TTTCAATOA 



TAGAAGAACC 
TATTTATCTG 
GGQACTTATT 
ATGGTGTCAC 
CCAATGTGTT 
TCATCRCTOT 
TC34AT6GTGT 
AACrOTCTGA 
TTGGTGCTGr 
CCCATGGGCA 
AGTCTCATGT 



C2VCAGTAQCT 
TATATTCTTT 
TGAAAATTAC 
TGTCATTTTG 
TTTTGGrTGGG 
AGCCnCSGCTT 
GCTCTGT6CA 
AQAACCAAG6 
GGXGAXGGTT 
GGAAATC3TTC 
TCAGCAGACA 



AA&TGGTCCC 
QCTACATCT6 
TGCAGAAATG 
ACATACCCTA 
AATCTTTC3W 
GT6TCATT0C 
ACTCCCCTCA 
ACACACTCC3G 
TTTCK5ATTCQ 
TACTGCXTTC 
ACACAACITT 



GC!CTTATCGA 
GATACTTQAC 

ATGACCIGGT 
TGGAATQCTTT 
CGGTITTCCA 
TGATTC3ATT6 
TTTTTATCAT 
ATAAGATTAT 
TCATGGCTAT 
CTGACAATTT 
CTACTTTAAA 



TATGrrCCATC 
ATTTACTGGC 
AACATTTeOA 
TGTSACAAGA 
CATTGTTQTA 
CCTGGGGATA 
TCCATCAGCC 

TACAAATACX 
CTCTCTCACA 
TATTAGTATC 



Seq ID NO: 550 Protein sequence
Protein Accession #: Eos sequence 



1 
I 

MGYQSQEPVI 

lAMISXHIIA 
SDISTQUTTIa 
YSSliEEPTVA 
RFC«3VTV31i 
VIiBLNGVLCA 



11 

1 

PPQRDLDDRE 

GDTL&XVFQfi. 
ILGIVMARAI 
KMSRLIHM8I 
TXI>HECPVTR 
TFLIFIIPS/^ 
yCFFtlNFSZiT 



21 
1 

TLVSEHEYKE 
VLLIKG(3AIiS 
IPGVDPBNVF 
SLBPHXFICTB 
VISVPICIPP 
-EVTJJSrVFFGG 
CYLKLSEEPR 
STSBSHVQQX 



31 
1 

KTCQSRAIJPK 
GTDTYQSIjVN 
IGRHFIIGLS 
lUWVFAKFllA 
ATCC3Y1jTPTG 
NLSSVFHIVV 
THSXliaKSCV 
TQLGKTLNISI 



41 
I 

WNSIIQSQI 
ICTFOFPGieUj 
TVTFTIjPIjSIi 
1QAV6VHSEA 
FTQGDIiPQVY 
TVHVITVATIi 
tCLPXOAWHV 
FQ 



51 
i 

IGLPYSHKQA 
L&VLQFIiYPF 
YSNIA3ELOKy 
FICEHNSFLV 
CHNDDLVTPG 
VSLLEDCLGI 
FGFVMAIOXIT 



Seq ID NO: 551 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1284



1 
I 

ATGGGCTAOC 
AAQjCZAAGCTS 



TTOaTCAATA 
TATOCTTTTA 
ITTCAAAjC?AA 
GGACTTIGCA 
GOAAAQRSTCr 
AGOGCAATTT 
CCCAATGCCA 
TTCrTAGTTT 
ATGTOCATOG 
ZTTACltaGCl 
aCATTTGGAA 
GTGACAAGAG 
ATTGTTCJZAA 
CIOGGGATAG 

caucCAGCscr 

TCTTGT6TCA 
ACAAATACTC 
TCTCTCACAA 
A!I!TACrTATCr 



11 

I 

A6AGGCAOGA 
GGTTTCCTTT 
TTTTATTGAT 
AAACTTTCOa 
TAGCAATGAT 
TCCCAGGAOT 
CAGTTAGCTT 



21 
I 

GOCTGTCATC 
GGQAATATTG 
AAAAGQAGGG 



dkCTGGGTCC 
ttcaagogqx 
ACAGTTTCTCT 
TGATTTCTGT 
TCAOOCftAlSe 
GATTTTOTTA 
AGGTAATTGC 

rrCTAGAACT 
9TTATCIQAA 
TGCTTCCCAT 
AAOACrOCAC 
ATACCTCAGA 
TTCAACTCGA 



AACTTACAAT 
TQATCCTGAA 
TACTCTGCCT 
TACAQCTTTTA 
ACACATACCA 
0GGC3QTTATQ 
AOAAGAACXX: 
ATTTATCTGT 

TGCTGTCACT 
CAATGT6TTT 
CATCACTQTA 
CAATGGTGT6 
ACTQTCTQAA 
7GQTGCTGT6 
CCAlX^dOCAQ 
GTCTCATGTT 
OTAA 



31 
I 

OCX3CCGCAGA 
CTTTTATTCT 

TATClGCrrCC 
ATAATACCTG 
AAOGTGTTTA 
TTATCCITGT 



AAAACAGAAG 
TCTTTTGCAT 
ACAGTAGCXA 
ATATTCTTTO 
GAAAKniACr 
GTCATTTTC3A 
TTTGGTGGGA 
QCCACSaCTTQ 
CTCT6TGCAA 
OAACCftaOQA 
GTGATGGTIT 
GAAAlGTXCr 
CAGCMSHiCAA 



41 

I 

GAGGATTGCC 
GGGTITICATA 
GAACIU3ATAC 
TCTCTGTTCT 
GAOATACT^ 
TTGGTCSGCCA 
AJCOGAAATAT 
TTCXTGGAAT 
ACGCI^^GGGT 
TTATTTOCCA 

CTACAXGTGa 
GCAGAAATGA 
CATACOCTAT 
ATCTTTCATC 
TGTCATTGCT 
CTOOOCTCAT 
CACACI0C3SA 
TTGOATTCSQT 
ACrGCTXTOC 
O^CAACri'i'C 



Seq ID NO: 552 Protein sequence
Protein Accession #: Eos sequence



1 
1 

VIGXQBQEFVX 
XiVMKTPGFPG 
GLBTVTFTLP 
FNAXQAWVK 
FIGFXQSDLP 



11 
I 

PFQROEiFYfiH 
YLLLSVLQFL 
t^LYRNIAKIj 
SFAFICHH17S 
JSNYCHHDDEiV 
ATZiVSEJiUX: 



SCWOIiPiaAV 
ZSIFQLB 



21 

1 

KQAGFPI<aiIi 

YPFiAMisnr 

GKVSlilSTGIi 
FLVySSIiBBP 
TPQRPCYGVT 
LGIVUSUIGV 
TNTQDCTBGQ 



31 41 

I I 
liLFHVSYVTD FSLVLURSG 
IIAGDTLSKV PQHIPGWDPB 
rm»lIiGIVMA PAISLGPHIP 
TVAKHSRIalH KSIVISVFXC 
VIIiTYPMBCF VTREVIANVP 
IiCATPIiIFII PSACarucL&B 
EMFXCFPDMF SL1MT888HV 



780 
840 
900 
960
1020
1080
1140 
1200 
1260 
1320 
1380 



51 
I 

ITATTCAATQ 
TGTTACAGAC 
CTACCAtSTCr 
7CAGTTTTTG 
OAQCAAAOTT 
CTTCATTATT 
AISCAAAQCTT 
TGXftATQGCA 
AT^njCAAAG 
CCATAACTCC 
CCTTATCCAT 
ATACTTGACA 
TGACCTGGTA 
GGAATGCTTT 
GGTXTTCCAC 
GA^TTGATTGC 
TTTTATCATT 
TAJUSftSTATG 



TGACAATTTC 



SI 
1 

AL8GIDTYQ8 

JSVFIGRHFII 
KTHDAKVFAK 
IFFATGGYLT 



Seq ID NO: 553 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1203



1 
I 

ATGGGCTAOC 
AAAOGAOGGG 
TTTCCAGOGX 
AGTTACAATA 
CATCCTGAAA 
ACTCrSQCTT 
ACAGGTTTAA 
CAGATACCAA 



11 



AGAGGGAGGA 



21 

I 

GOCTGTCATC 
AACAQATACC 
CTCTGTTCTX 
AGATACTTTQ 
TGGTOGOCAC 
OOGAAATAZA 



ATCTGCTCCT 
TAATAQCZGG 
ACGTGTTTAT 
TATCCCTGTA 
CAACTCTORX 
AAACAGAAGA GGCTTGGGTA 



31 
1 

OOGCCGCAGT 
TACCAQTCTT 
CAGTTTTTGT 
AfiaiAAGTTT 
TTCATTATTG 
GCAAA6CITG 
GTAATGGCAA 
TTTGCftAAGC 



41 

I 

TGGTCAATAA 
ATCCTTTTAT 
TTCAAAGAAT 
GACTTTCCAC 
GAAAGGTCTC 
GOGCftATTTC 
CCAATGCCnr 



60
120 

180

240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080
1140 
1200 
1260 



BFaTHGDKIH 
QQ7TQILSTLN 



60 
120 
180 
240 
300 
360 
420 



51 

1 

Tl'i'ATTQATA 

AACTTTCGGC 
AGCAATGATA 
CCCAQGAGir 
AGTTACCTTT 
OCTCATCTCT 
ACTGGGTCCA 



60 
120 
180
240 
300 
360 
420 
480 
GCSGGlTTATGr
GAAGAACCCA. 
TTrATCTGTA 
OACTTATTTG 
OGTGTCACTO 
AATGTGrTTT 
ATCSUrrOTAG 
AATGCTGTGC 
CTCJTCTGAAG 
GGTGCTGTOa 
CATGGGCAOG 

TcrrcaTGTTC 



(TTTTTGCATT 
CAGTAGCTAA 
TATTCTTTGC 
AAAATTACTG 
TCATTTTGAC 
TTGGTQGQAA 
CCACGCTTGT 
TOTQTGCAAC 
AACCAAGGAC 
TGAtTGGTTTr 
AAAT6ITCTA 
AGC3U3ACAAC 



TATTTGCCAC 
GTGQTCCCGC 
TACATGTOGA 
CAtJAAATGAT 
AT&CCCTATG 
TCTTTOVTro 
GTCATTGCTG 
TCCCCTCATT 
ACACTOCGAT 
TGOATTCQTC 
CTGCTTTCCT 
ACAACTTTCT 



CATAACTCCT 
CTTATCCATA 
TACTXGACAT 
GACCTGGTAA 
GAATGCTTTG 
GTTTTCCACA 
ATTOftTTQCC 
TTTATCATTC 
AACATTATGT 
ATOGCTATTA 
GACAATTTCT 
AdTTAAATA 



Seq ID NO: 554 Protein sequence
Protein Accession #: Eos sequence



1 
1 

MGXQRQEFVr 
8SIZXAGDII. 
TCJLTTIiIIiCJI 
EEPTVAKWSR 
GVTVII.TYPM 
HGVZ.CATPLX 



11 
1 

PPQPSIiVWjI 
SKVFQRIEGV 
VKARAJSU9P 
LIHMSIVIfiV 
ECFVT2EVZA 
FIIPSACYIiK 



21 

i 

KGGAIiSGTDT 
D»£I9VFZ6RH 
SXPKTEDAHV 

FICIPFATC3G 

NVPFcaoaLSS 

LSEBPRTHSD 
SRVQQITTQL& 



31 
I 

YQSLVZnCTfG 
FIIGLSTVTP 
FAKFHAIQAV 
YLTFTGFTQG 
VFHIWTVMV 

TUaSIFQIiG 



TCITAGTTTA 
TGTCCATCGT 
TTACTGGCTT 
CATTTGGAAG 
TGACAAGAGA 
TTGTTGTAAC 
TCGGGATAGT 

CTTQTGTCAT 
CAAATACTCA 
CTCTCIkCAAA 
TTAGTATCTT 



41 
I 

FPGYIiUjIjSVIi 

GVHfifAFICH 
DI>FEHYCEEHD 
ITVATLVSLL 
GAWKVFQFV 



CAGTTCTCTA 
GATTTCTOTA 
CACCCAAGGG 
ATTTTGTTAT 
QGTAMTGOC 
AGTGATGGTC 
TCTAGAACTC 
TTATCTGAAA 
GCTTCCOVn 
AGACTGCACC 
TACCTCIUM 
TCAACTCGA6 



51 
1 

QFLYPFIAMI 
AKLQKVS1.IS 
SNSFLVYSSI. 

ZDCLGIVLEL 
MAITMTQDCi' 



Seq ID NO: 555 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1140



1 

I 

ATGGGCTAOC 
CCAGOCTTATC 
TACAATATAA 
CCTGAAAA09 
CTGCCTTTAT 
GQTTTAACAA 
ATACCAAAAA 
QTTAKTrCTT 
6AAOCCACAG 
ATCTTCTATAT 
TTATTTGAAA 
GTCACXGTCA 
GTGTXTTTTG 
ACTGTAGCCA 
GGTGTGCTCT 
TCTGAAGAAC 
GCTGTGGTGA 
GGeCM3K3AAA 
CATGTTCAGC 



11 
I 

AGAGGCZUSGA 
TQOTC3CTCTC 
TA6CTGGAGA 
T(3VTTATTB<G 
CCTTGTACOG 
CTCTGATTCr 
CAQAAGAGGC 

TAGCTftftGTG 
TCTTTGCTAC 
ATTACTGCAQ 
TTTT6ACATA 
GTGGGAATCr 
OGCTTGTGTC 
SIGCAACXCC 
CAAGQACACA 
TOGTTTTTGG 
TOTTCXftCTO 
AGACAACSUSi 



21 
1 

GOCTGTCATC 
TGTTTCXTCAQ 
TACTTTGAGC 
TCQCCACTTC 
AAATATAGCA 
TGGAATTOTA 
TTG6GTATTI 
TTGCSCRCCAT 
GTCGOGCCTT 
ATGrTOGATAC 
AAATGnrGAC 
COCTATGGAA 
TTCATCQGTT 
ATTGCTGATT 
CCTCATTITT 
CrCCJGATAAG 
ATTOGICATG 
CTTTCCTQAC 
ACTTTCTACT 



31 

I 

COGCC9GCAGG 
TTTTTQTArrC 
AAA6jra-i,-iTC 
ATTATTOGAC 
AAGCITOGAA 
ATGOCAAGQO 
6CAAAGOCCA 



AT0CATAT6T 
TTGACATTTA 
CTGGTl^ACAT 
TGCTTTGTGA 
TTCCACATTG 
GATTOCCTOG 
ATCATTOCAT 
ATTA3X3TCTT 
GCTATTACAA 
AATTTCTCTC 
TTAAATATTA 



41 
1 

TCAATAAAAC 
Ci'TlTATAGC 
AAABAATCCC 
TTTCC3\CAGI 
AQOTCIOCCT 
CRATCTCAiCT 
ATGCCATTCA 
XAGITXACAS 
GCATOGTGA.T 

ctggcttcac 

TTGGAAGATT 
CAAQAOAaOT 
TTGTAACAGT 
GQATAGTrCT 
CAGCCTGTTA 
OTQTC3\aX3CT 
ATACTCAAGA 
rrCACAAAXAC 
GTATUTTX'CA 



Seq ID NO: 556 Protein sequence
Protein Accession #: Eos sequence



1 
I 

PEWVPIGHHF 
IPKTEDA^tVF 
IClPHVrOGY 
VFPGGOT.9SV 
SEBPRTHSDK 
3BCVQQTTQCiST 



11 
1 

PPQVNKTFGF 
IIGLSTV^FT 
AKPNAIQAVG 
LTFTGFTQ^SJ 
EHIWTVMVr 

iMSCvniiPie 



21 

I 

PGYIiLIiSVLQ 
I^SXiYSNZA 
VM8FAFICHH 
If ECnrCRNDD 
TVATLVSJ.LI 
AWHVFSPVM 



ai 

I 

FIiTfPPIAMie 
KLGIC7aZ.IST 
NSFItVYSSLE 
IiVTFGaFCYG 
PCLGXVIiEIN 
AITUTQDCTK 



41 
1 

IfNIIAtaXTLB 
GLTTLn^IV 
EPTVA^SRL 
VTVILTYPME 
QVIiCATPDXF 
GQEHPyCPPD 



Seq ID NO: 557 DNA sequence
Nucleic Acid Accession #: XM_057188.1
Coding sequence: 769..4269



I 
I 

ATGGGATTGC 
CTCOOCCAAG 
CTGCOCTPCC 
CTCCGCIGT6 
TCTGTOCCCT 
AAQACTCCCC 

GGTTrocicr 

GCAGGAGGTT 
CCCRAOATCT 
G&TGGGGTCr 
TGCCTTGACC 
OIVGCAAACIX 



540 
600 
660
720
780
840
900
960
1020
1080
1140 
1200 



51 
1 

TTTCOGCTTT 
AATGATAA6T 
AGGAaTTQAT 
TACCTTTACT 
CATCIUTACA 
OGGTOCAC3U: 
AGOGGTOQOG 
TTCTCTAGAA 
TTCTOTA^TT 
CCAAGGOGAC 
TTGTTATGGT 
AATTGCCAAT 
GATGGTCATC 
ASAACrCA2VI 
TCTGAAACTG 
TOCCATTGGT 
CTGCACCCAT 



ACTCSAOTAA 



51 
] 

KVPQRIPGVD 
MASAISLGPH 
XEKSTV13VP 
CFVTSEVIAISI 
ZlPBAiCniKli 
HF8bXEITSE9 



11 
1 

CTcrcccrcT 

TCTGTGTCCC 
COCAATTCTT 
TQTCTCTCTC 
TATCGOGGCC 
GCCCCCCAGA 
GGG08GGTCX 

TCRAGAAGAA 
CGCTCIAXTG 
TCCCAAAGTS 
TCTGGGAAGT 



21 
I 

cactctgggc 

TCTCTCTOCC 
TGGTTTTTGC 
CCCCCJGOCCC 
IGGGACCOGC 
CCTOGCCCGS 
GGAAGCAGA6 
COaCAGCATG 
GRCCTGCAOG 
COCAOGCtGT 
CTTGGATTGT 



31 
I 

TTCTGTCCCA 
CTAAATCTCT 
ATCOOCCTCT 
GQACCTCTGC 
CCTCTCOCCG 
CCOCAGGCIA 
CG6GOOGAGG 
QTGGTGCCGG 
AOGTTCATAa 
TCTCAAAGTC 
GOCCAGCOGT 
GXXGGAAdSTA 



41 
I 

CTCTTATCTT 
GGCCCGTCCX 
GCCCCTTGCX; 
ACCCCCCA6G 
CCTCCCGCTT 
GGCTGGAAAG 
GAGCGCCGGG 
AGAAGGAGCA 
TTQACTCCAC 
CTGGGCTCAA 
CCTTSAAGTT 
GCTGCIUSGGG 



60 
120 
180 
240 
300 
360 



60
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080



60 
120 
180
240 
300 
360 



51 
I 

AGTGTCAGTC 
TTCTGAGTTC 
TCAGTCAAGT 
TOGCXGTCCC 
TOaCQTCTCC 
XOGAGGATCC 
QCOCTGGGCT 
GAGCTGGATC 
AGATCCGOGG 
GCAGTCCTTC 
TTGGTCAGAA 
AACISAGGGGA 



60
120
180
240 
300 
360 
420 
480 
540 
600 
660 
720 






TTCAAGGATG OAGCTGAlUa GGGTGAftCGG AChAAfiTGGG TAAACT(3AAT GGAGGA.TGCC 780 

TTCX?GGGCA6 CCGTOGTG&C OSTGTGGGAC AQCGATGCAC ACACCMiraaA GAAGCCCACC 840 

G^KSCXTTACQ GAOAGCTGGA CTTCA0C5Q6G GCCGGCCGCA. AGC&CAGOVA TTTCCTCOGG 900

CTCTCTGACC GAACGGATCC AGCTGC3iGTT TATAOTCTGG TCACACXJCAC ATGGGGCTTC 960 

0GTGCCCCX3R. ACCTGGTGGT GTCAOTGCTG G06G6AT0GG GG0GCCCO6T CCTGCZAGACC 1020 

TOOCTGCnSG ACCTGCTGC3G TCGT66GCXG 6TGCOGGCTG OCCAGAGCAC AGGAfSCCTQG 1080

ATTGTCACro OQGGTCTGCA CAOGGGCATC QGCCGGCATG TTQGTGTGGC TGTACGGGAC 1140

CATCAGArrGG CIZAaCACTGO GGGCACCAAG GTQGTGGCCA TGGGTGTGC3C CCCCTGGGGT 1200 

GTGGTCCCCA ATAGAGACAC CCTCATCAAC CCCAAQQQCT CGTTCCCTGC GAGGTACCGG 1260 

TGGGQCGGra ACOTaOAGQA OQGGGTCCAQ TTTCCXXTCGG ACTACAACTA CTCGGCCTTC 1320 

TTCCTGGTGG A06AGGGCAC ACACG6CTGC CT6Q6GGGCG AGAACCGCIT CCGCITGGGC 1380 

ClGGraSTGCT ACATCTCACA GCAGAABAO? QaaSTGGGAG GGACIGGAAT TSACATCCCT 1440 

6TCCTGCTCC TCCTGATTGA TGGTGATGAG AAGATGTTGA O60GAATAQA QAACtSCCACX: 1500 

CAGGCTCflfiC TCCCATGTCT CCTTOTOGCT OQCTCSkflOGG GAGCTSOOGA CTGCCTGGOG 1560

6AGACCCT66 AAdSACACTCT GGCCOCAGGG AGTGGGGGAG GCAGGCAA0G OGAASCCOSA 1620 

GATOGAATCA GQCQTTTCTT TOCCAAAGGa QACCTTGAGG OMXnGCAGGC GCSUOSVGGAG 1680 

AGGATTATGA C006GAAG6A GCTCCT6ACA GICTATTCTT CTQAGOATGG OTCTGAGOAA 1740 

TTGGAGAOCA TAlGTTTTQAA GOCX?CTTOTO AAGGOClGXG GGAGCTOGGA GGCCTC7U3CC 1800

TACCTGGA3X3 AGCTGCGTTT GGCTGTGGCT TGGAACCGCG TGGACATTGC CCAGAaTOAA 1860

CTClTTOOgG GQGACATCCA ATGGCGQTCC TTCCA:iCrCO AAGCTTCCCX CA!CG6AGGCC 1920

CTGCTGAATS ACGSGOCIGA GTTOGTGGGC TTGCTCATTT CX3CACGGCCI! CAGGCXQGGC 1980

CACTTOCTOA CCCCGATGCG CCTGGCXXSA CTCfTACAGCG CGGC3GCCCTC CAAjCTCGCTC 2040 

ATCC5C3CAAOC TTTTQGACCA GGCGTCOCAC AGC30CAGGCA CCAAAGCCCC AGCCCTAAAA 2100 

QQGGGAGCTG CGGAGCTCGQ GCCCCCTQAC GrGGGGCAOS TCCTGAGGAT GCTGCTGGGG 2160 

AAGATGTG06 CGOOGAGGTA GGGCTC0Q3G OQCGCCXGGG ACCCTCACCC AGGCOVGGGC 2220 

ITO880GAGA GCArrGTATCT GGTCrOSGAC AAGGCCftOCI CGCX3GCTCIC GCTGG3VTGCX 2280 

QGCCTCGOOC AQQCCCCCTG OAGOGACCTG CTTCTTTGQQ CACTGTTGCT GAACAGGGCA 2340 

CAGATGQCCA TGTACTTCTrG GGAGATGGGT TCCAATGCAG TTTCCTCAOC TCrTGGOgCX: 2400 

TGTITQCTQC TCCGOQTOAT GOCACXSCCTQ QAQCCTGACG CTGAGGAGGC AGCAOSGAQG 2460 

AAAGACCIGG CGTTCAAGTT TGAGGGGATG GGGGTTGACC TCTTTC3GOGA OIGCTATCQC 2520 

ACCAGTSAGG TGAfiGGCTGC COOCXTTCCTC CTCCeTCGCT GGOGGCTCTG GGGGGATGCC 2580

ACTTSCCTCC AGCTGGCCAT GCAAGCTCAC G0C06TG0CT TCTTTBCC9CA OQATOGGGTA 2640

CAGTCTCTGC TGACACAGAA GTGGTI3QC3C1A QATATGGCCA GCACXACACC CATCTGOGCC 2700 

CTGGTTCTOS CGTTCTTTTG OCCTCCACTC ATCTACACCC GCCTCATCAC CnXZAQOAAA 2760 

TCAQAAGAGQ AGCCCACAtirG GGACK3AGCTA GAGTTTGAjCA TGGATAGTGT CATTAATGGG 2820 

GKAGGGCCT6 TG6GGACGGC GGACCCAGOC GAGAAGAOGC OGCTGOGGOT CCCGCGGCA^ 2880

tCCXaaOGOOTC CGQarTOCTQ OGGGOaCCOC TOCGGGQGGC QCXXSSTGCCT AG6CGQCIGG 2940 

TTOCACTTCT GGGGOGCGCC GGTGACCATC TTCATGGGCA ACGTGOTCAQ CTACCrOCTG 3000 

TTCCTGCTGC TTTTCTOGCG GOTGCrGCTC OTGOATrTCC AGCCGGCX3CC GCCCGGCTCC 3060

CTGGflGCTGC TGCTCIATTT CIGGGCTTTC AGGCTGCTOT GCGAGGAAC7 OCQCCAGGQC 3120 

crGAGCstua acGtaGOGCAa ccroGcxnac ggggoccxicg ggccdoggcsi tsocecactg 3180 

A6CCAG0G0C TGCGCCTCTA CCIGGCCGAC AGCTGGAACC A6TG0SACCT AGTOGCaCTC 3240 

ACCTGCTTOC TCCTGGGCGT GGGCTGCCGO CTGACCCCGG GTTTGTACCA CCIGGGCOGC 3300 

ACTGTCCTCT GCATCGACTT CATGGrTTTC AOGGTGOGGC TGCTTCACAT CTTCAOGGTC 3360

AAGAAACAGC TOGGGCCCAA GATOGTCATC GTGAGCAAQA TaATGAAGGA CGTaTTCTiC 3420 

TTGCTCXTCr rrCCTOSGCGT GIGQCCGGTA GGdATGGOG TGG0CA06GA GGGGCTCCI6 3480 

AGGCCA0G9GQ A«3ICACTT COCAAGTATC CTGCGCX^GOG TCTTCTACOS TGOCiaiCCTO 3540 

CM3ATCIT0G GGCAGATTCC GCA06AGGAC AIGGACGTGG OOCTCATGGA GCACAGCAAC 3600

TaCTCCTTCSQG AQCCCGGCTT CXOGGCACAC CCTCCTI3Gaa CCOWaGOOGG CACCTGOGTC 3660

TCCCAGTAIG CCAACEQGCT GGTGGTSCTG CTCCTOSTCA TCTTOCTGCT OGTGGOCAAC 3720

ATCCTOCTQG TCAACTtOCT CATTQCCKTG TTCaUZFTACSl CSkTTGSGCAA AGXACAGGSC 3780 

AACAGGGATC TCTACTGSAA GGGGCAfiOGT TAGCGCCTCA TCOQG0AATT CCACTCTOaG 3840 

OCX3GCGCIGG CCCOGOCCTT TATCOTCATC TCCXSiCTTGC GCCTOCTGCT CAGGCaUVTTG 3900 

TGCAGGCGAC OOCGGAGOOC CCAGCOSTCC TCCCCGGCOC TCX3AGCATTT CCJQOQTTTAC 3960 

CTTTCTAAGG AAGCXiGAGCG GAAGCTGCXA AOGIGGGAAT OGGTGCATAA GGAGAACTIT 4020 

CTGCXGQCSU: GQQCXAGQGA CAAGOSGGAQ AQC3ACXC06 AGGGTCTGAA GCGCACGTOC! 4080 

CftGAAGGTGe ACTtGGCACT CBUUUMCIG OGACACATCiC GOGAGTAGGA ACAGOGOCTG 4140 

AAAGTGCTGa AGOGGGAGGT OCAGCftGTST AGCOGOOTCC TGGGQTGGCSr QGCOGAGGCC 4200 

CTGAGOCGCT CTGCCTTGCT GCOCOCAfiGT GQGCGGCCAC CCCCTGACCT GCCTGGGTCC 4260 

AAAGACTOAO CXX^TGCTGGC GGACTTCAAO GASAAQCCCX: CACAGGGQAT TTTGCXOCTA 4320 

GAGTAAGGCr CATCTG6GGC TOGGCOOOCG QMXTGGXGO OCTTGICCTX GAGGTQAOQC 4380 

CCATGICCAT CTGGGCCACr GTCSUSGACCA OCTTTGGGAG TGrCATCXni AGAAACCACA 4440 

SCATC3CCCGG CTCCTCCCAG AACCAGTCCC AGCCTGGGAG GATCAAQGOC TGGATGCOGG 4500 

GOCGTXAXCC A-rCTGGAOGC TOCAGGGTCC TTQOGGTAAC AOGGACCACA GACCCCTCAC 4560 

C!ACrCACAGA rTTCCTCACAC TGGGGAAATA AAGCC3^TT!CC AGAGGAAAAA AAAAAAAAAA 4620
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 558 Protein sequence

Protein Accession #: XP_057188.1

1 11 21 31 41 51

| | | | | |

MBDAFGAAW TVWDSDAHTT KKPTDAYGEL DPTGAGRKHS NFUHiSDRTD PAAVYSLVTH 60 

THGFRAPNIjV VSVLGG&GGP VUQTHLQDLI. RRGIiVRAAOS TGAHIVTGGL HTGIGRHVOV 120 

AVRDHQMA8T GGTKWAMGV APW6WBNRD TZtlNPKGSFP ARYRWRGDPE DOVQFPUXfN 180 

YSAFFliVDOG THGCLGGEHR FSZJOiESYia QQKIGVGGTG IDIPVLLLLI DCa3EBa4IiTRI 240 

EtIATQAQIfC IiIiVAGSGGAA VChK^hBm LAPGSOGARQ OEARDRrRRF FPKODIjEVIiQ 300 

AQVBRZMIRK GSEGFETlVIi KALVKACSGSS BASAYLDSLR LAVANNRVDI 360 

AQSBStiFRGDX QHRSFHIjEUVS IMDAUJn^RP EFVRtiLISHQ Z»SIiCSIETiTFM RI^AQIiYSAAP 420

eHSIilKHUiD OASaSAGTKA PALKQGAAEIi RFPDVGHVLR MLIiGKHCAPR YFSGGAHDFH 480

PGQGFGBSWY lASDKATePlj SLDAGLGQAP MSDLLIiWAlili liEJIlAQHTUiYP HEMQSHAVSS 540 

ALGACIiLLtiV MAKLEPPAEB AARRKEJLAFK FEGMGVDI.FG ECSfRSSEVRA ARLI>LRBCPI« 600 

nCDATCI^LA HQADARAFFA QDGVQSLLTQ KHWODHASTI PIWVLVLAPF CPFLIYXBLI 660 

TFRKSBBBPT KEELSFDMDS WXNGBGSVQr Al^AEKTPLG VFRQS8RPGC GGGRCSGOBRC 720 



lAQGLSGGOG 
HLGHTVItCZID 
EQUtiRPRDSD 
GrTCVSOYAHW 
TEISRPAIiAPP 
KEtlFUtAHAR 
VASAl/SRSAIi 



PVTIPMGNW 
SLA9GGPGP6 
FMVPTVELIiH 
FPSIMIRVFY 
I,WLLI,VIPL 
FIVISHIiItLL 



liPPGGPPPPD 



SYliLFULIiFS 
KASIiSQRLlEaj 
IFTVWKQLGP 
RPYLQIFGai 
I>VAiaiIiLV19L 
LROLCRHPSS 
KRTSQKVDZA 
LPGSKD 



RVUaVDPQPA 
YlADSmiQCD 
KIVIVSKMMK 
PQEDMDVAIjM 
LIAMFSYTPG 
PQPSSPAIiBH 



PPG8LBUULY 
IiVAIiTCFI.La 
DVFFFLFFLG 
EHSHCSSEFG 
KVOGHSDIiYH 



Seq ID NO: 559 DNA sequence

Nucleic Acid Accession #: l9HL006a59«l

Coding sequence: 26..874



1 
I 

AOQAATCTOC 
ATOGGGOVSh 
CATGAGOATT 
CAGGATCATC 
OCaAGAASaCG 
AGKXXZ&CIGC 
GQAGGGCTGT 
CAGOCTCCCC 
CTCCRTCACC 
CAGCTGCXrrC 
CXTQCSATGC 
CAftCATCACA 
GGGTSftCTCC 
CCA£3QATC3C30 
OGACIGGATC 
ACOCTCCATT 
CAAGACCCTC 
AATCAACCTG 
GACTCTGGGA. 
TCCTGGCXAT 



xa 

1 

GGTCTCACAQ 
CTGCAGTTAA 
AAQGGQTTCG 
CGGCTACTCT 
CTCAAGCCCC 
GAGCAGAGCC 
AACAAAQACC 

ATTTCCGGCT 
GCCAACArCA 
GACACCATGG 
OGGOQCCCTC 
TOTGOGATCA 
CTUGGAGAOQA 
TCCACiaXSGI 
TAGGAACATT 
GGGTTCGAAA 
ATCSACAACAC 
ATATCAAGC3T 



21 
I 

GGCA6AXGCA 
CAGCCAAGQA 
TCCTGCTTGC 
AGTGCAAGCC 
GIGOQGCGAC 
QCTACATAGT 
6GACAGCCAC 
ACGQCAATQA 
GACOCpTCAC 
GQGGCAQCAC 
CCATCATTGA 
TGIGTGCCAG 
TQGTCTGTAA 
CC0SAAA6CC 
TOAAQAACAA 
6TITGGTTOC 

TCTUSTGAGAC 
CTQCaTTTOTT 
TTCAATAAAT 



31 
1 

GAG6TTGAGG 
ACCrGGX3GOC 
TCTGGCAACA 
TCACTCCCAG 
GCrCATC2GCC 
TCACCTOOOG 
IGftSTOCTTC 
CaTCATOCTG 
CCTCTCCTCA 
OTCX3U3CCOC 
GCACCAGAA6 
OSTCaCAOGAA 
CCTGTCTCTT 
TGGTGTCTAC 
TTASACTaOA 
TGTTCACrCT 
CCTGGACTAC 
CTGGATTCAA 
CTCTQTTOTA 
ATTX6CTAAA 




GGGCTTGTAG 
CCCTGGCAGG 
CCCAOATGGC 
CAGCACAACC 
OCCCACCCCG 
OTOAAeATCG 
CQCTGTGTCA 

TGTGAGAA06 
GOGGGCAAGS 
CAAGGCATTA 
AOSAAAJSTCT 
GOCACCCACC 
CTTAATAAGA 
AG0AGATGCX 
ATTCTGOClT 
TCCCCAiKTOC 
TGAGTG 



FWAFTUiCBE 

vgcrltp6ly 
wlvaygvat 
fwahppgaqa 
kaqryrlire: 
kklltwesvh 

VQQCSRVIiSBr 



51 
I 

ACTGGAAlSrC 
CCCTCCAOGC 
GGGGAGAGAC 
CAGCCCTGTT 
TCXn^QACASC 
TGC3U3AAGGA 
GCTTCAA£»A 
CATCGCCAGT 
CTGCTGGCAC 
TGCCTCACAC 
CCTACCCCX36 
ACTCCTGCCA 
TCTCCTGGGG 
GCAAATA^CGT 
ACAGCCCATC 
AACCCT7UU3C 
GTCACTTAAT 
GAAATAXTGT 
CAAAGACAGC 



Seq ID NO: 560 Protein sequence
Protein Accession #: NP_006844.1

1 11 21 31 41 51

| | | | | |

KRILQIiXIfLA LATGLVGSET RXJ1S3FECRP H9QFHQAALF BKT&UiCGAT T1IAPRNLK.TA 
ARdrKPRYIV HI/3QH£aiQKB BGCECTTRIAT BSFPEPGFHN SLRiIKDHRBD THIiVKMASF? 
eiT«AVHPLT LggRCVTAGT SCLISGWGST BSPQBRLEHT LRCftNITIIE HQKCBNAYPG 
XilinSTlWCAS VQBGGKDSOQ ODSGGFLVar OSLQGIIEWQ QDPCAXTSKP GVYTKVCaceV 



Seq ID NO: 561 DNA sequence
Nucleic Acid Accession #: Ay04C419.1
Coding sequence: 1..1743



1 
I 

ATCTTTACCT 
CXTGGGATCA 
GAj8CAGGAA& 
GGGGTCCrCB^ 
GOACTOCa?AA 
ATTGOCSVTAG 
ATTGCTCCTC 

TACavrOTTTG 
OCTCCAAGCC 
GGAAGOTTAA 
AAAGATGAAT 
OaATAATGA 
TTOTTCTATG 
CXCXSOCTCCA 
GTAGACCATG 
TTGC3TGACCA 
AGCCACAATT 
TCAACCAACA 

TCACtcavitsc 

CTAAATC3CT6 

GCivrxxxGA 

GGXCTAGQAC 
GGA6CCATG6 
TTTTTGACTG 
AGXCTAGCAT 
OAACAAAXAT 
AGXCAICACC 
CTCrrTGOAGT 
TAA 



11 
I 

TCCrrGTCATC 
TCTCTGGGGC 



TAOAiraGATA 
GCTTAGTCTT 
GQGTTTCCAT 
AACACAGAAG 
CXGGCTATAT 
OTCTT6TGAT 
CTGQGTTTCr 
GAGCACTCTC 
ATCA6TACAG 



21 
I 

TGTCACTOCr 
TCTTCITCA6 

ctccczgqtc 

TGGAAGAAG6 
GATCCTCACT 
CTOCKTCTCT 
AGGOCITCTT 



31 
1 

GCT{3TCAQTG 
ATCAAAAGCT 



ACA6CAAICA 
TTATCCTAjCA 
TOCATTGCCA 
GIGICACTGA 



CATCAACT6T 



TGOGCASCAA 
TGGaCATC(^ 
CTATCaUU3CA 
ACAATACTCT 
OCCTGAGAAA 
GAXTAAGOCA 
AATOGCTGTC 
CRATGCCCTB 
CTTTAACTTC 
TAACXOATCT 
CCOTOCTTTT 
CAATGGAGCT 
AAGAAGAATT 
GXAACAAGCX 



TCCCTTGG6A 
GGTGATGAAA 
AGATACAACT 
TXTIXGGOAT 
ACTASTATTT 
TTTGA3USTCA 
AGTOGtCAAG 
AACATTCCrC 
AAATCTCAAC 



CSGAGACC3U: 
TCJATGTGGAT 
CACTGAATAC 
CTTA6CCAGC 

TAGCATGAAC 
TATTOGCCTG 
XGTTGTTATG 
AGCAAAAGTG 
AGTOCCAAAA 
QTGTGGTAGG 



GTTTTGCAAG 
GOACSUUSAOG 
GAGGAACTCA 
CTGTXTCGXT 
TTTOTACAAA 
GIXGGRTTTC 
GTCATTAGCA 
TGCATTGGCT 

a:iocacatg3^ 

■rrtCAAAdSSGA 
AAGAOAGGGG 
CAGATA6TCA 
TPGCTTQTTT 
AGC3SZIGATCT 
TGGGQCATCA 
CCATGGGTGT 
TTTATACCTG 
AACTAIGTGA 
CAGCCTCAAA 
GC3CCAATCCA 



41 
I 

GQcrccraaT 

TATTAGCCCT 
TOCTEBOCTC 

CQOTTC^AT 

AIGAGCIGAT 
ATGTTTTOCA 
CAATTGCAAT 
GAGCTGCIAQ 
CTGTGATCAA 
CAAAAGACaUk 
TCACTGGGCA 
AAAGKAATGA 
OCATCCCIGC 
CCTCTOTGAT 
ACTTCACCC3V 
TTTATGSAOC 
TTTCTTCCCA 
AGACGAJCICTC 
CAGACCGTGG 
ATGITGCTGC 
TTOCIGGTGG 
ATCTQCTCAT 
GCTTTATATA 
AGACAAAQGG 
AAAAGAACAT 
AA2GAAAAGC 



780 
840
900
960
1020
1080 
1140 



60
120
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140 



60 
120 
180
240 



51 
I 

GQGTTATGAA 
GAGCIGOCAT 



CTGOCTGCIT 
AGTOGGaaSC 
CATOGCAGAG 
GATTGTC!A3<! 
TGGCTGQAAG 
6XATXXTCIX 
CAAG5TTCTT 
ATOCTCCCTG 
CATGCGGACC 
ACCAAACATA 
GGOUSQAGC 
CACTCTTCTT 
GGCAGCTT06 
TATCXGCAGA 
AGGAAACCTG 
TAGCAGAAGC 
AGCATCCTTG 
GGACGTCCCA 
TTTTTCAATT 
GAXCAGAGGA 
CTOSCTGACA 
TACAATICAXG 

XTGTZXTAT6 
CCAGGAGCAG 
TCCAGAGAOC 



60
120 
180 
240 
300 
360 
420
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
Seq ID NO: 562 Protein sequence
Protein Accession #: AAL02327.1

1 11 21 31 41 51

| | | | | |

MPTFIiSSVTA AVSOIiLVGYE IiGIIfiGAIiLQ IKTLLAIjSCE EQafWSSLV ZGftLLASIiTO 60 

6VI»IDRY6&R TAIIIiSSCLIi 6LGSI>VI.rZ^ liSYTVLIVGR IKJGVSlShB 5IA7CVYTAS 120 

TAPOHRTWSIJ* VSLNELHIVI GlIiSAYlSNY AFAXlVFHGtTK YMFGLVIPUQ VLQAIflMYPL 180

PPSPRFLW4EC CSQEOAASKVL GRLRAl^SDTT EELTVIKSSL KDEYQYSFHD LPRSKDNMRT 240 

RXni6I>TI*VF FVQITGQPNX IiFYASTVIilCS VGFQSNRAAS LASTGVOTVK ViaTIPATI«Ii 300

VEHVGSKTFIi CIGSSVMAAS LVTMGIWLH IHMNFTHICR SHNSZHQSLD B&VIVGPGHL 360 

GXnimTLEam FKGISSHSRS SXiKPIilWUVD KRQBTTSASIj LHAGLSHTEY QIVTDPGE>VP 420 

AFIiKMIiSIiAS IJ*VyVAAFSZ GIi6PHP1fLVI» SE1FPG51HG RAMAX>TSSHK HGlNlitiXSLT 480

FIiTVTDLIGI* PWVCFIYTIM SIASU^FWM FIPETKOCSL EQ18MBIAKV HYVKHHTCFK 540

SHHOEEIiVPK OPQKRK?QEQ IiLGCNKLCXai GQ5RQLSFET

Seq ID NO: 563 DNA sequence

Nucleic Acid Accession #: XM_059466.1

Coding sequence: 1..894

1 11 21 31 41 51 

I I I I i I 

ATGOMDCtaC GGGOSCTCSST CAOGGGGCTC AGCCTCGGCC TG3W3CCTOTQ CTCCCtGf3GG 60 

CTGCTOGTCA CGGOCATCTT CAOCQACCaC TGGTACGA(3A. OCGACCCCXJa GOGCCACAAQ 120 

G2U3»GCTGOe AGCX3CAGC0G OGOGGGOGCC fS^CCCCOyaa ACCAGAAGAA OC3GCCTGAT6 180

OCGCTGTOQC AQCTGCCGCT GCXSaGACrrOe CC0CX3GCTGG GGCGCC33QCT GCZCCGGGGC 240 

GGCCC9GGGGC GGGCOGAOCC OGAGTOCTGG OGCTCGCTCC nGGGGCTOGG OGGGCTGGAC 300 

GCCQa<7rGGB QCOGGCCCCT CTTCGCCACC TACTCGGGCC TCTOOAiSaAA mtJCi'ACTTC 360 

CTGGGCATOG AOOGGGACAT CGACACCCTC ATCCTQAAAG GTATTGCGCA GCGATGCACG 420 

GCCATCAACSr AjQCAiTrTTlC TCAGGCCATC OGCTTGCGAA ACATTCCTTT TAA*nTAACC 480

JUIGACCftXJ^C ASCAAGATQA QTGQCACCTe CTTCAXTTAA. GA&CAATCAC TGdGQCXTC 540 

CrOGGCATGS GOOTAGCaaT OCIICTCIGC GGCTGCATTG TGOCCACSUST CAOTTTCTXC 600

TGGGAGGJVGA GCTTGACCCA GCACOTOOCT GQftCTCCTGT TCCTCRTGAC AGGGATATTT 660 

TQCACXS^TTT CCCrrCTGTAC TTATGGCSGOC AGTATCTCGT ATGATTTQAA CC3GGCXCCX3i. 720 

AASCTAATTT ATAGOCTGCC TGC7GATGTQ GAACATGGTrT ACASCTGGTC CATC7TTTGC 780

QCCTGGTOCA OTTTAGGCXr TATTGTOGCA GCTGGAGGTC TCTQCATCQC TTATC06TTT 840
ATTAjSCCGGA OCAAGATTGC ACTGCTAAAQ TCrOGCAGIUS ACICCACOGT ATGA 






Seq ID NO: 564 Protein sequence
Protein Accession #: XP_059466.1

1 11 21 31 41 51 

{ I I 1 I 1 

HEPRAIiVTAI> SH/JLSLCSUa ZJiVTAIFTDH HYETDPRSHK ESCBRSRASA DPPCQKHRZ^ 60 

PL&HIiPIiItDS PPLGRRLLPG GP6SADPBSW BSLLOLOObD AECGRPLFAT Y86LHRKCYP 120 

XjOidrdidtl luosiacHtcr xnaasBOpi rluhipssilt KnoQDEHRii wwjwsf 180

LGM&V&VLLC GCrVATVSFF WEESLTW7A GLLFIJ4TGIF CTISLCIZAA BISZDLNRLP 240 
KXirYSLEADV EBQY5NSIFC AWCSLGFIVA AGGLCIAY]^ I9RTKZAQUK SGRDSXV 

Seq ID NO: 565 DNA sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..3315

1 11 21 31 41 51

| | | | | |

ATGTGCTTTC GGGCBOOCA6 GCTCAGCATG AOQAACAOAA GGAATGACAC TCTGGACAGC 60 

AOCCGGIUXC TGTACTCCAS CGCQTCTCGG AGGACAGACT TGTCTTACAB TGAAAGCSQAC 120 

TITGGTGAATT TTATTCAAGC AAATT7TAAG AAAOQASAAT GrGTCTTCTT TAGCAAAGftT 180 

TCCAAGGCCA aaOAQaATXM OTGCAAGTCT GGCTATGCCC AGAGCCAGCA CATGGAAGQC 240

ACCCAGA3CA ACCAAAGTQA GAAATGGAAC TACAAOAAAC ACAOCAAGGA ATTTOCTACC 300

aACQt3C3:nx3 OdGAaATTCA GTTTQAGACA CTG8GGAAOA. AAOOGAAGTA. TA.TAC3QSCCIG 360 

TCCIGOGACA CGQACGCQGA AATOCTTTS^ GAGCVGCTGA COCAGCACTG GCACCTOAAA 420

ACACCCAACC TGGTCATTTC TGTGACC3SGG GGCOCCATUSA ACTTCQCCCT GAAfiCGGOGC 480 

ATGCQCAAGA TCTTCAGCC5S GCTCATCTAC ATOGOGCAGT CCAAAGOTGC TT«3QATirCIG 540

AGGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600

ATCaflGAGGA tSSKMSbGGK GIATATIGTG GGCATEGGCA TOg CftOCTTQ aaOCASGStC 660 

TGC7UU30GGG ACACCCTCAT CAQQAATTGC GAXQClTGAGG CCTATTTTTT AGCCCA6TAC 720 

CTTATGGAT6 ACTTCACftAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTIG 780

CTeCrCGTGG ACAATGGCTG TCATGGACAT COCACTOTCQ AAGCRAAGCr CaSGAATCAG 840 

CTAOAC^AnGT ATATCTCTGA QCGCACTATT CAAGATTCX^ ACTAT06TGG CAAGATCCCX: 900

AITGTGT6TT TTGCXICAAiGG A66TGGAAAA GAGACTTTQA AAfiCCATCAA XACCXCCIWrC 960 

AAAAATAAAA TTQCXTGTGT GGTGGIGGAA GGCTCX3GGOC A6AT08CTGA TGTQATCGCIT 1020

AGCCTGGTGG AGGTGGAGGA TGOGCTGACA TCTTCTGCOS TCAAGGAGAA GCTGGTGCSC 1080

TTTTTACCCC: GCAOGOTaXC CCGOCTGCCT GAG6AGGAGA CTGAGAGTTG GATCAAATGO 1140 

CTCAAAGAAA TXCTGGAAT6 TTCTCACCTA TTAACAQTTA TTAAAA3GGA AGAAGCIGG6 1200

GATGAAAT^G TGAOCSIATBC CATCTOCIAG GCTCXATACA AAGGCTTGWG aVCCAGTaAa 1260 

CAAIGACAAGG ATAACTGGAA TGGGCAGCra AAGCTTCTGC TGGAGTGGAA OC3\(SCTGGAC 1320 

TTAGCC3A1G AK5AQATTTT CACCAATGAC C3GC30GATGGG AGTCTGCTGA 0CTTC3UM3AA 1380

GTCATQTTTA CQGCTCrCAT AAAGGRCAJGA CC3CAAOTTTQ TCCXsCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC XACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500

TTCaUSCACaC TTGTGTAGCSG GAAXCTQCAS ATGQCCAAGA ATTCXTFATAA TGATGCCCIC 1560 

CTCAOGTrtG TCTGGAAACT GGTTGOGAAC TTOCX3AAG2U3 GCTTTCCGGAA GGAAGACa^ 1620 

AATOaCGGGG AC3GAGATGGA CATAQAACTC CAaOAGQTGT GTGCXATTAC TOCGCACSCOC 1680 

CTGCAAGCIC IGTTGATCIG GOQCATitJXT CAGAATAAGA AG6AACTCTC CAAAGTCAIT 1740 
TGtSOAaCAGA
CTGSGCAAAG 
TAGGAGACCC 
GAACAGCTGC 
GTGQAGGCCA. 
C&ATGCTAT6 
ATTATACCCT 

AATGTGGTCT 
CATTOGGTGC 
GATGAAGTGA 
ATOGACAOQC 
AATAAMVSCr 
CTRA1C3ATTGA 
CASAGGATCC 
TTTtSGCGKSG 
C6TT0GGTCA 
GGTACCAiCXSl' 
GTGGAGCTGG 
TGCATCTACA 
1:ACAC3GGTGC3 
CTOGTGCAOQ 
TTCTACATGG 
TCTGTCTGCT 
GAAAACTACC 
OGATTTAQAC 
AAl%A&ATaV 



CCACGGGCTG 
TGAAGAACGA 
GGGCTQTTGA 
TGGTCTATTC 
CAGAGCAGCA 
GAGAGATTTC 



TTTOGTACTA 
TCTACATCGC 
CRCACGCCCC 

TGOeStSCTTTT 
CTTTGTATTC 
TCCRCATTTT 
TCATOQATGT 
CCAGGCAAG6 
TCTAC3GAGCC 
ATGACTTTGC 
ATGAOCaCAA 
TBTTATOCAC 
GCAOOGTCCA 
AGTACTGCA6 
TGQTGAJ^AA 
GTTTCAAAAA 
TTGTCAABAT 
AlUrrGGATAC 
JUUCOA 



CACTCTGGGA 
CATCAATGCT 
GCTGTTCRCT 
CTGEGftAGCT 
TTTCATC3GCX: 

TGOCTTTSTA 
TGTGtSGGTTC 
CTTCCTCCTG 
CiSiVGCTGGTC 
OGTAAATOGQ 
TTACTTGATA 
TGGAOTAC^TC 
TfiCTGTAAGC 

GATGCITAG6 

CTAccn^cc 

CCAGIGCACC 
CCTGCCCCQG 
Cau^CA^TCCTG 
GGAGAACAAT 
CCXSOCTCAAT 



TGAAGACAAT 
CAACAG2UM 
AAAGCTTAAT 



GCTGQGGZVSr 
GAQTGTTACA 
TOG6GTG6AA 
CAGCXrrOGQQ 
AAC3AACTGGA 
TCATTTAGQA 
TTCRCCTCCC 
CTGTTTGcdr 
CTGTACTCGC 
GXGAATTATT 
6CA0QAATTG 
ATTTTCTGTC 
AGAAACTTAa 
CIGTTCCTCT 
CAGAATOAGC 
ATGTTCGGCX; 
TTCAl^GQGA 
TTCCCOGAST 
CTGGTCAAOC 
QACCAOGICT 
ATCCOCTTCC 
TGrlGCTGCA 
GAOACTCTOa 
GGCAAOGACA 



OCAGCAAGCr 
COGAGGAQCr 
GC3VGCGATGA 
GCAACTGTCT 
TCCAGAATTT 
AGATTATCXrC 
AiSAAACClY^T 

ACGTGCTGCT 
TG6TCTTTGT 
TTACTQAOCX 
TiVXTTOOGCI 
TGGACTACAT 
GACCCAIUGAT 
TT6CGGTQTQ 
AGOQCIGG&S 
AGGTGCCCAQ 
AT6AGICCAA 
GGATCACCAT 
TQCTGG^CGC 
GGAAGXTCCA 

ccTTcavrcGX 

AGQAOAAAAA 

OCTCAOAQQA 
GTCTTCTGAA 



TCTGAAGRCT 
GGCTAATGAQ 
AGACTTGGCA 
GGAGCTGGCG 
TCmCTAAG 



GGACAA0CAC 
CTTCTCCTGQ 
CRTGGATTTC 



GTGGAATOIG 
CCACTCTTCT 
TATTTTCACT 
TATAATGCTG 
QATGGTGGCC 
GTGGATATTC 
TQAiCGIGGAT 
GCCACTC3TGT 
CGCCCTGGTG 
CATGTTTGGC 
'OAOaTACTTC: 
CTTCGCTTAC 
CATGGAGTCT 
TGTCATGAAQ 
AATGAGGCAT 
AQAOATTQCT 



Seq ID NO: 566 Protein sequence
Protein Accession #: Eos sequence



1 
I 

MSFRAARLBM 
SKAT&NVCKC 
SCDTCABIKtY 
riGGTH^GlMK 
t«D(DFTBDPIi 
lVCFi\QGGGK 
FLPRTV8RI.P 
QDKDHHNGQL 
GLNLRKFIiTH 



11 
I 

RNRRHDTLDS 
CYAQSQHMEG 
EEtJiTQHWHliK 
yiGBWKDNT 
YJIiDOaiailTHIt 
KXUCAINT6I 

DVIiTELFSNH 
BDVG&XTEmP 



21 
I 

TRTLYSSASK 
TQINQSEKWil 

I8R8SEEN7V 



lAKVKNDXVA 
VEATDQKPIA 

HKEitAnnrvAF 

DSVBQHYVIIO 
]:iRX«IHIFTVS 
RSVIYSPVIiA 
CIYMLSrail. 



QPGVONFIiSK 
FTSPFWFSH 
VNYPTDtiWHV 
BNLGPKIXHL 
MPGQVPSDVD 
LVNIiZjVAMFG 
CCCKBKMMB8 
UUCQUiKEIA 



KNKIPCWVE 
UCBZlaECGHL 
LAIIDBIFIMD 
F&TUVyRHZiQ 
LQALFIHAII. 
YETRAVELiPT 
QWnSEISRDT 
NWFYIAFEiZi 
MIXCLGLPyFI 
QSMLIDVFFF 
GXTTOFAHCT 
YTVOrVQENlY 
SVCCFKAlJtfUN 
HICEJC 



31 
I 

&TDL&YSESD 
YKSETKBFPT 
GMCNPALKPR 
AIQIAAHQHV 
PTVEAKLRNQ 
GS6QIAZ>VIA 
U:VIKHEBAG 
KKHBSADLQB 
XARHSSUDKb 
QHKKEIiSKVI 
ECY8GDSDLA 
KH17KIII.CI^ 
IfAYVUMDF 
AGIVFRXSSS 
I«FIf AVWHVA 
FTONESKPIiC 
SQVmCFQRYF 



41 
1 

ItVNPlQANFK 
DAFGDIQFBT 
K RKIFS RMY 
SMRSTIiXREIXC 
LEKYISERTI 
£LV£VEDALT 
DEIV9HAISY 
VMFTALIKDR 
LTFVHKZiVAll 
WBQTRGCTLA 
EQIiliVYSCBfl 
IIPLVGOSFtf 
HSVFHPFBHiV 
HK8SI.Y&GKV 
FGVABQGILR 
VELDESBHLPR 
IiVQBYCSRIiN 
ENYLVKOITK 



1800
1860
1920
1980
2040
2100
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940
3000 
3060 
3120 
3180
3240 
3300 



51 
I 

KRECVFBTKD 
LGKKGKXIRI. 
ZAO&K)GMrri> 
BIUEEGYFIiAQY 
QDSH^CGKIF 
SSAVKEKLVR 
AIiYKAFSTSB 
PBCFTRLFliEtl 



ALGASKtliKT 
HQGfiHCLELA 
&FRKKFVDKH 
LYSZiVFVIiFC 
IFCLDYIIFT 
GVOEQRMRWIF 
FPEWITIPi:.V 
IPEPFIWAY 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780
840
900
960
1020
1080



Seq ID NO: 567 DNA sequence

Nucleic Acid Accession #: NM_006911.1

Coding sequence: 1..558



1 11 21 31 41 51

| | | | | |

atgcx:tcgcx: TGTTCITGTT CCACXTIGCTA GAATTCTGTT TACTACTQAA CCAATTTTOC 60 

AG7USCA6TOG 0^KX3^TG OAAQQACCSAT (STTATTAAAT TAXGOGGCOG CGAATTAGTT 120 

OGCBCZOCAQA TTGCCATTTG OGGCAIGAGC AOCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180

GATGCrOCTC AGACACCTAO ACCAGTOGCA GAAAXXGTAC CATOCTTCAT CSU^CAAftGAl 240 

ACAGAAACtA TAAXTATCAT GTTOGAATTC ATTGCKAATT TGCCAiOOGGA GCTGA2UGGCA 300

GOCCTATCTG AOAGGCAACC ATCATXACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

QATTCCRATC TTAGCTTTGA AGAATTTAAG AAACITATTC GCAATASGCA AAGTGAAGCC 420 

GCAGACAGCA ATCCTTCAQA ATEAAAATftC TTAGGGTTGG ATACXCATTC TCAAZIAAAAG 480

AQACJOACGCr ACGTOGCACV GTrXGUGMA TGTTGCC3M TTOGTTCmC CAAAAOGTCT 540 

CTTGCTAAAT ATTGCTGA






Seq ID NO: 568 Protein sequence
Protein Accession #: NP_008842.1

1 11 21 31 41 51 

| | | | | |

MFRLFIjFHIiI. EFCI<IiIiHQP8 RAVftAKHKDD VIKLCSRHLV RAQIAIGGHS XW6KR&L8QB 60 
DAPQTPRPVA BIVPSFINKD TETZIIMIiBF Z2^NI>FPEIiKA AIiSSRQPSIiP EUQaYVFAI.K 120 
I^SHLSFEBFK KLIRKRQSBA ADSHPSBLIOr 7.GU3THSQXR RRFYVALFBK CCLIGCTKRS 180
IiAKYC 

Seq ID NO: 569 DNA sequence

Nucleic Acid Accession #: XM_036453.1
Coding sequence: 1..3978

1 11 21 31 41 51

| | | | | |

ATGCTGCCCG TCTAlOCAGaA GQTGAAQCCC AACCC3GCTGC AQGACCCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CARTCCCTTG TTTAAAATTG GCCATAAACG (3AGATTA6AG 120 

G3UW3ATaATA TGTATTCAQT aCTQCCWSAA GACCGCTChC ASCACCTTGG ASAGOAQTTG 180

CAAGGGTTCT G06ATAAA6A. ASTTTTAAGA GCTOAQAATIQ ACGCRCAGAA GC3CTTCTTTA 240 

AC?Uy3flGC2iA TCATAAAGTC TTACTQGAAA TCTTATTTAG TTlrGGGAAl* TTTTACGTTA 300 

ATK3AOI3AAA GTGCXMAfiT AATCX^COC: ATATTTTTGG 6AAAAATTAT TAATTATTTT 360 

GAAAATTAXG ATCCCATGGA TTCTGIGBCC T7GAACACAS C36TACBCCTA TQCCAOGCTG 420 

CTQACTTTTT GCAOGGTCAT TXTQGCTATA CTGCATCACT TATATTTTTA TCAOQTTCAQ 480

TGTGCTGGGA TGAGOTTACX? AGTAGCCATG •rGCC!ATAT€3A. TTTATCaOAA GOCaCTTaST 540 

CTTAGTAACA TGGCCAT<3GS QAAGACAAOC ACAGGCCAGA TA6TCAATCT GCXGTCCAAT 600

GATGTGAACA AGTTTOATCA GGTGACAGTG TTCTTACACT TCCTGTGGQC AQQACCACTG 660 

CAGGDGATCQ CRCTGACTGC OCTACTCTOS ATGGftGATAB GAATATOGTG OCTTGCTOflG 720 

ATGGCAGTTC TAATCATTCT CCTGOOCTTG ChAflfiCTCTT TIGGGAAQTT GTTCTCaTCA 780 

Cl:GAQ(3AaTA AAACTGCAAC 1^CA£X3GAT GOCA^^TCA OGACCATGAA TGAAGTTArTA 840 

ACTGGTATAA GGATAATARA AATGTA06CC TGGGAAAAGT C&1TTTCRAA TCTTATTAOC 900 

AATTTC?AGAA AGAAGGAC3AT TTCCAAGATT CTGAGAAGTT OCTGCCTCAG GGGGAOCOAAT 960 

TTOGCrrCGT 7TTTCA6TGC AA6CAAAATC PfPCSaa&TTEQ TQAGCTTCAC CACCTAOGTO 1020 

CTCCTCGGCA GTGTGATCAC AGCC3iaCCX3C C3TGTTOGTGG CAGTGACGCT GTATGGGGCT 1080 

GTGOGGCTGA OGGTTACCCT CTTCTTCOOC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140 

GTQWQCATCC <3AACSAATCCA GACCTTTTTQ CTACTTQATG AOATATCACA GCGCAACCGT 1200 

CAOCXGCCGT CAGATGGTAA AAAGATGffTG CATGTGCAQG AITTTACTGC TTTTTGGQAT 1260

AAGGCATCAS AOACCX7CAAC TCTACAAOGC CXTTOCTTTA CTOTCAGACC TGGOGAATTG 1320 

TTAGCTGTGG TOC9GCXXX3GT GGOAESCAQGG AAQTCKTCAC TCTTAAGTGC OSTGCTOGGQ 1380

GAATTGGCCC aW3TCA030 GCTGGTCAGC GTGCATGGAA GAATTQOCTA TQTGTCTCAG 1440 

CACCcilrrGaa tcttctcggg aactctgagq aqtaatattt tatttgggaa gaaatacgaa 1500

AAGGAAOSAT AIGAAAAAOT CATAAAGGCT TGTGCTCTGA AAAflaSATTT ACAGCTOTTG 1560 

G?VOGAl'QGTa ATCTGACT6T GAIiftfiGAlGAT CaGOGAACCA aaCIQAGTQG AGGGCRGAAA 1620 

GCACGGGTAA ACXTTTGCaUiQ AQCftGTGTAT CMGATGCT6 ACATCTATCT CCTGGACBAT 1680 

OCTCTCAQTG CAGTAGATGC GGAAGTXAGC AOACACTTOT TCBAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGftAOATCAC AATTTTAGTG ACTCATCAGT IGCAGIACCT CAA7U3CTGCA 1800 

AGTCRQATTC TGATATTGAA AGATGGVAAA AIGOTGCAQA AQGOGACTXA CACTGAGTTC 1860 

CTAAAATCTG GTATAfiATTT TOGCTCCCTT TTAAA6AAGG AIAATGAGGA AAGXGAACAA 1920

CXITCCAGTTC CAGGAACTCC GACACTAAGG AATOGTACCT TCTC3U3?U5TC TTGOGTTTGQ 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCIC TOGAGAGCCA AGATAC3W3AJ3 2040 

AATGTCCC3U3 TTACRCTATC AGAGGAGAAC CGTTCrOAAO GAAAAGTTGG TTTTCAGOCC 2100 

TATAAGAATT ACTTCAGAGC TC3GTGCTCAC TGGATTGTCr TCATTTTCXn? TATTCTOCTA 2160 

AACSVCTGCAQ CTCAOGTTGC CTATGTGCIT CAABATTGQT GGCTTTCATA CIGGGCAAAC 2220 

AAAC AAAGTA TGCTAAATQT CACTGTAAAT GGAGGAGGAA AXGTAACC3GA GAAOCTAQAT 2280 

CtTRACT Qgr ACTTAGGAAT TTATTCiUSGT TIAACTGTAG CTAOOSTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGQXATT CTAOGTCCTT 6TTAACTCTT CACSVAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCXGAA AfiCTCOOGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GOTTCTCCAA A6ACATTGGA CACTTGGAXG ATTTGCXGCC GCTQAGGITT 2520

TTAGATTTCT. TOCAGACATT GCTAJCAAQTQ -GTTGGTGTGG TCTCTGTOGC TGTG6CG6TG 2580 

ATTOCTTGGA TGGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCC3GCGA 2640 

TATTTTTTGa AAAOGTCAAG AGATGTOaAa CGCCTGGAAT CTACAACTOG GA6TCCAGTG 2700 

TTTTCOCACT TGTCATCTTC XCTCCAGGGG CTCTQGACCA TGCGQGCA13L CAAAiSCASAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAQOATTTAC ATTCaaAQGC TTGGTTCTTG 2820

ITTTTOACAA CGTCCCJGCTQ GTTCOCCGTC OCTCIGGAT6 CCATCTGTGC GASGTOTOTC 2880 

ATCaVraSTtG CJCTTTOGGTC CCTGATTCTO GCaUiAAACTC TGQATGCCXSQ GCfflGGTTGGT 2940 

TXiaaCACTGT OCTATGOCCI CAOSCICATO GGGATGTTTC AGTGGTGTGT TOGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGICA TTGAATAGAC AGACCTTQAA 3060 

AAAGAA0CAC CTEGGGAATA TCAGAAAOGC CCRCd^CCMB CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCrrrG ACAATGTOAA CTTCATOTAC A61CC»0G*EG GGC3CTCTGG1' ACTGAAGCSCT 3180 

CIGACAQCAC TCATTAAATC AC3UVSAAAAG GTTGGCATTG TGGGAAGAAC OGGAGCTGGA 3240 

AAAKSTTCOC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC COGAAGGTAA AATTKMRTT 3300 

GATAACATCT TGACAACTGA AATTGGACTT CKCdAlTTTAA GQAAGAAAAT GTCAATCATA 3360 

OCTCAGGAAC CIGTTTTGTT CACIXSGAACS^ ATGAGGAAAA ACCTGGATOC CTTTAATGA6 3420 

CACAOSGATG AGGAACKTTQ OAATGCCTTA CAAGAGGTAC AACTXAATWSA AACCATTGAA 3480 

GAXCTTCCIQ -fTrAAAMGGA TACTGAAITA QCAGAATCAG GATOCAATTT TAGTGTTG6A 3540

CAAAGACAAC TQGZGTGCCr •nSCK^AGOGCA AITCTCAGGA AAAATCAGAT ATTGATXATT 3600

GATGAAlSOGA 06GC31AATGT GGATCCAAGA ACTGAIGAGT !EAA1ACAAAA AAAMTCCGG 3660 

GAGAAATTTG CCCACTGCAC OGTGCTAACC ATTGCACACA GATTGAAOU^ CATTATTGAC 3720 

AGC3GftCAAGA TAATGGTTTT AjQATTCAGOA AGACTGAAAG AATATGATGA GCCGTATGTT 3780

XTGCieCA2lA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTOGG CAAOGCA6AA 3840 

GCC3GCTGG0C TCACIOAAAC AlfSCAAAACnG GTATACTTCA AAAGAAATTA TCCACATATT 3900

GGTCACACTG ACCACKTGGT TACAAACACT TCCAA3GGAC AGOCCTOGAC CTTAACTATT 3960 
1TOQAGACA6 CACTGXQA 



Seq ID NO: 570 Protein sequence
Protein Accession #: XP_036453.1

1 11 21 31 41 51 

I I I 1 I I 

KLEVVQBVSP HPLOPABIiCS RVFFWHUIPI. FKIGHKSRLC: EODHXSVSJeE DSSQaLGEEL 60 
QGFHIMCSVZJl AENDAQKPSL TRAIIKCYHK SY£.VZdGIFII> ZBB8AKVIQP IFLGKIINYF 120 
KNYDPtmSVA LBITAYAYATV IiTPCTKIIiRI LHHIjYFYHVQ CfiSHHURVAM CRMmiKALR 180 
LSNMA}fi»KTT TGQIVNLLGbT DVNKPDQVTV FLHFLHAGPIj QAIAVTAIiIjW MEIGISCXAG 240 
MAVLZXL&Pli QSCFGKIk^SS liRSKITAXPZD ARIRTKKIEVl TGIRIIia>«YA KSKSFfiHLIT 300
HL&XKSISKI IiBSSCEiRQKIir IA8FF9A5KX ZVFVTFlrTYV LliGSVI'TASR VFVAVTLYGA 360
VKLTVTEiPPP
KASBTPTLQG 
QFHVFSQTIjR 
ARVHLARAVy 
SQILIIiKDGK 
8QQSSRPS3LK 
NTAAQVAYVL 
ARSIJiVPYVIi 
LDFIQTLLQV 
rSELSSSLQa 
IIVAFGSLIL 
KEAPWEYQKR 

HTPEELmSAL 
FETAZ> 



SAIEKVSfiAX 
IiSFTVRPGBL 
SNILFGIOCyE 
QDRDIYLLDD 
MVOKOTYTEP 

QDwni.synAN' 

VnSSQTIiHNK 

IiHTIIlAYKAB 
AKTLDAOQVG 
FPPAHPHEGV 
L8EPBSKIWZ 
QBVQLKBTIE 

KMVQQU5KAB 



VSIHHJlQTFL 
lAWQFVGAG 
KERYEKVIKR 
PLSAVDAEV9 
UCSGIDFGSli 
KVPVTIjSEBK 
XQ8HLNVTV29 
HFBSlIiKAEV 
IPHXAIPLVF 
ERiCQSbPDAH 

DIMKMDTEL 
EKFAHCTVZiT 
AAALTETAKQ 



LLDEISQRHR 
KSSLLSAVX.G 

RHLFELCIOQ 
LKKDNEE&EQ 
R9ECSKVGFQA 
GG(3JVTEKliD 
LFFDHHPIGR 
LGIIFIFLRR 
QDLHSEAWFL 
GMPQWCVRQS 
SPGQFLiVIiKH 
HDLRKKKSII 
AESOSHFaVG 
lAHRLHTIZD 
VYFiEREn-FEl 



QLPSDGKKMV 

EDGDL-TVIGD 
lUTEKITIliV 
PPVPGTPTLR 
YKNYFHAGAH 
LSIVfYLGIYSG 
IlililRFSKDIG 
YFLET&RDVK 
FLTTSRWEAV 
AEVEKMMISV 
LIALIKSQEK 
PQEPVIjFTQT 
Q&QLVCLARA 
SDKIMVUDSG 
CaCTDHMVTNT 



EVQDFTAFWD 
VHGRIAYVSQ 
RGTllfSOGQX: 
THQLQYIiKAA 
NRTPSBSSVW 
WrVFIFLILr* 
LTVATVLFGI 
HtJlDIrLPIjTF 
HLESTTKSPV 
RXiDAZCAMFV 
ERVIEYTDLE 
VGTVOaTQAG 
MRKNLDPFNB 
IXSKHQIIiII 
SXiKBXDEPYV 
5HGQPSTLTX 



420 
480 
540 
600 
660 
720 
780 
840
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



Seq ID NO: 571 DNA sequence
Nucleic Acid Accession #: AF07120
Coding sequence: 116..4093

1 11 21 31 41 51

| | | | | |

GGACASGOGT QGCBGCC3QQA GOCCCAGCAT CCCTGCTTGA QC3TCCAGGAG OGGAGCCCXSC 60 

BQCCZACCQOC GCCT6ATCAG C(5CC3ACOCX33 GOCOGCJGCCC GCCCOQCXCG QCAAGATGCT 120 

GCCCSTOTAC CROQRiSGTGA AiSCOCAAOOC QCTaCAQORC QOGAACATCT GCTCACQCOT 180 

GfttCTTCIGG TGGCTCAATC CCTTGXTTAA AATTGGOChT AAACOQAQAV TAGAGGAAGA 240 

TGATATGTAl? TCASrSCTGC CAGAAGACCG CTCAiClUSCAC CTTOGAGAGO A6TTGCAAGG 300 

GTTCTGGGAT AAAGAftGTTT TAAf5AGCTGA GAATGAOQCA CROAAGCCTT CTTTAACAA© 360 

AQCAATCATA AAGTGTTACT GGAAATCTTA TTTAGTXTTG GGAATTTTTA OQITAATTGA 420 

GGAAAGTGCC AAAGTAATCX! ASCCCATATT TTTGOGAAAA ATTATTAATT ATTTTQAAAA 480 

TTATOATOCC KSGSKT^CXG TG6CITTGAA CACAQGQTAC GGCTATGCXZA GOOTGCTOAC 540 

TTTTTGCAC6 CTCATTTTBG CTATACTGCA TCACTTATAT TTTTATCAOS TTCAGTGTGC 600 

TQC^GATGAGG ITTAOGAGTAG CCATGTGOCA TATGATTTAT OOGAAGGCAC TlGGTCTTAG 660

TAACATGGCC ATGGGGAAiaA CAACCACAGG CCAGATAGTC AATCTGCTGT CCAATGATOT 720 

QAACAAGTTT GATCftGGTGA CAGTGTTCTT ACACTIOCTG TGGGCAGGAC CACTGCAGGC 780 

GATOSCAGTQ ACraCCCTAC TCTGGATOGA GATAOGAATA TCSIGOCITG CTGGGAT60C 840 

AGTTCIZUkTC A1TCIOCTGC CCT7GCAAAG CIGTOCITGGG AAGTT6TTCT CATCACTGAG 900 

6AGTAAAACT GCAACTTTC!ft C!GGATGGCAG GATCAGGACC ATOAA-TGAAG TTATAACTGG 960 

TATAAGGATA ATAAAAATGT ACGCCTGGGA AAAGTCATTT TCAAATCTTA TTACCAATTT 1020 

GAGAAAGAAG GAGATTTCXIA AGATTCXGAG AAGTTCCTGC CTCAGGGGGA TGAATTTGGC 1080 

TTCGTTTTTC AGTGCAAGCA AAATCATCGT BTTTGIGAGC TTCACCTOCT ACGlGCrCCT 1140 

OGGCAGTOTO ATCACAGCCA GCCGOGTGTT OGTGGCAGTG ACGCTGTATG GGGCTGTQOG 1200 

QCXOACEtSTT ACt3CTCITCr TCCOCTCAGC CATTQAGAGG GTGTCAGAGG CTATCS3TCAG 1260 

CATCOSAAGA ATCCAOAOCT TTTTOCTACT TGATQAGAXA TCACAGCGCA AGGGTCAGCT 1320 

GCOGTCAGAtr GGTAAAAAGA TGGTGCATQT GCAGGATTTT ACTGCrTTTT QGGATAAGiGC 1380 

ATCAGA0ACC CCAACTCTAC AAOQCCTXTC CTTTACTGTC: AGACCXOGOG AATTGTTAGC 1440 

TGTGGTCX36C CC0GTGGGA6 CAGGGAAQTC ATCACIGTTA AGTGCGG3Y3C TOSGGGAATT 1500 

GGOCCCAAGT CAGOaGCTGG TCAGCGTGCA TGGAAQAATT GOCTATGTGT CTCAGCAGCC 1560 

CTGGGTGTTC TOGGGAACTC TGAGGAGTAA TATTITATTT GOGAAGAAAT ATGAAAAGGA 1620 

ACGATATGAA AAAGTCATAA AGGCTTGTGC TCTGAAAAAG GATTTACAGC TGTTGQAGQA 1680 

TGGTGATCTG ACTOTQATAG GAQATCGGGG AACCAQGCTG AGTGGAGGGC AGRAAGCAOG 1740 

QGTAAACCTT GCAAGAGCAG TGTATCAAGA TOCTQACATC TATCTCCTGG ACDArCCTCr 1800

CAG7GCAGTA QATQCOGAAQ TZAGCAGACA CrTGTTCGAA crGTSCAOm GTCAAATTTT 1860

GCATGAGAA6 ATCACAATTT TAGTGACTCA TCAGTltSCAG TACCTCAAAG CTGCAA6TC& 1920 

GATTCTGATA TTOAAAGATQ GTAAAATGGT GCAGAAGGGG ACTTAOUCrG AGITCCEAAA 1980 

ATCTGGTA*J!A GATCTrGGCT CC5CTTTTAAA GAflGGRTAAT GRGGAAAGTQ AACAACCTOC 2040 

AGTTCCAGGA ACTCCCACAC TAAGGAATCG TACCTTCSCA OAGTCSTOGG TTTGGTCTCR 2100 

ACAATCTTCT AGACCCTOCT TGAAAGAIGG TGCTCTGGAG AGCCAAQATA CSiGAGAATGT 2160 

CCXZAGTHACA CTATCAGAG6 AGAACOGTTC TOAAGOAAAA S CTGGCT TTC AOaCCXKTAA. 2220 

GAAITACTTC AGAGCTGGTG CTCACTGGAT TGTCTTCATT TTCCTTATTC TCCTAAACAC 2280 

TGCAGCTCAG GITGCCTAT6 TGCTTCAAGA TTGGTGGCTT 7CATACIGGG CAAACAAACA 2340 

AAGTATOCTA AATQTCACTG 1:aAAIGGAGG AGGAAATGTA AC!CGftGAAGC TAGATCTTAA 2400 

CTGGTACITA GGAATTTATT CAGCnTTAAC tSTAGCtAOC GTTCTTTTTG GCATAGCAAG 2460 

ATCrCTATTG GTATTCTTACG TCCTTGTTAA CTCTTCACAA ACTTTGCACA JbCAAAATQTT 2520 

TGAGTCAATT CTGAAAaCTC OGOTA.TTATT CITTG»TAGA AATCCAATAG GAAGAATTTT 2580

hPXXCG-CU'EQ TCCAAAGACA TTGGACACTT GGATGATTTG CIGCGGCTGA OOTTTTTAGA 2640 

TTTCATCCAO ACMTGCTAC AAGTGQTTGG TGTGGTCTCT GTOGCTOrGG COGlGATTCC 2700 

TTGGATCX3CA ATAOCCTTOQ TTCCCCTTGG AATCATTTTC ATTTOTCTTC GGCGATATTT 2760 

TTTGGAAACG TCAA6AGAT6 TGAAGCXSCZCT GGAATCTACA ACTOGGAGIXC CAGTGTTTTC 2820

OCACTTGTCA TCTTCTCICC AGGGGCTCTG GACCATCOQG GCATACAAAG CAGAAGAGAG 2880

GTGTCAGGAA CTGTTTGATG CACACC3«3aA TTTACATTCA GAGGCTTGGT TCTTQTTTTT 2940 

GACAACGTOC CGCTGGTTCG CCSSTOOGTCT GGATGCXS^TC TGTGCCATGT TTGTCATCAT 3000 

CGrTGCCTTT GGGI P OCCTGA TTCTGMiAA AACTCIGGAT GCOGGGCAGG TTGGTTIGGC 3060 

ACTGTOCXAT GCCXTTCACGC TCATOGGGIAT OTTTCA0IGG TGIGIXCGAC AAAGTGCXGA 3120 

AGTTGAGAAT ATQAT8ATCT CAGT3U3AAAG G6IC%TIGAA TACS^CAGACX: TTGAAAAAiGA 3180 

AGCSICCTEGG GAATATCAGA AAOGOOCACC AQCAGCCTGG OOCCATGAAG GAGTGATAAT 3240 

CTTTGACAA:r GlGAACTTCA TQTACAGTCC AGGTGGGCCT CTGGTACTOA AfiCATCTGAC 3300 

AGCACTCATT AAATCACAAfS AAAAGGTTGO CaVTTGTGGGA AGAAOOGGAG CTGGAAAAAG 3360 

TTCCCrCAIC TCnSGOCTTT rCAGAITQIC AmuU3CGGAA GGXAAAATIT GGATTGATAA 3420 

GAICXTGACA ACTOAAATTQ GACTTCAOGA TITAAG6AAG AAAATGTCAA TCATACCTCA 3480 

GQAACCTGTT TTGTTCACT6 GAAOUUGAG GAAAAACCX6 GATOCCTTZA AGGS^GCACAC 3540 

G6ATGAG8AA CIGIGGAATG CCTTACAAGA GGT!ACAACTX AAAGAAAC3CA TIGAAGATCT 3600
TGCXGGTAAA
ACAACTGGTQ 
MXGhOSGCh 
ATTTOCCCAC 
CAAjGATAATG 
GCAAAATAAA 
TGCOCTCACT 
CACTGRCCAC 
GACBfiCACTG 
TTTGGACTAT 
CAJUaATGCTA 



ATGGATACTG 

AATGTGGATC 
TGCZACOBTGC 
GTTTTAGATT 
GASAGCCTAI 
6AAACAQCAA 
ATG(3TTACAA 
IGAATCCAAC 
QTAAACCACA 
GtMCATTTGA 



AATTAGCAGA 
GGGCAATTCr 
CAAGAACTGA 
TAACCATXaC 
CAI3QAAGACT 
TTTACAAGAT 
AACAGGTATA 
ACACTTCCAa 
CAAAATGTCA 
TTGTACTTTT 
ATATTTCTC3C 



ATCAOOATCC: 
CAG6AAAAAT 
TOAGTTAArA 
ACACAGATTQ 
GAAAGAATAT 
GGT6CAACAA 
CTTCAAAAGA 
TGGACAGCC3C 
ASTCCGTTCC 
TTTTACTTTG 
C 



AATE7TAGTG 
CAGATATTGA 
CAAAAAAAAA 
AACACCATTA 
GATGAGOC^ 
CI6GOCAAOG 
AAmATCCAC 
TOGACCTTRA 
GAAGGCATTT 
GCAAC3UUITA 



Tl'GGACAAAG 
TTATTGATGA 

TCGOQGAGAA 
TTGACftGCGA 
ATGTTTTGCr 
CAGAAGCCGC 
ATAlTGGTCA 
CTATTTTCGA 
TCCACTAGTT 
TTTATACATA 



3660
3720
3780
3840
3900
3960
4020
4080
4140
4200



Seq ID NO: 572 Protein sequence
Protein Accession #: AAC27076.1



1 
I 

MbFVYQEVKP 
EKYDPHDSVA 



11 

1 

NFIiODANICS 
AEtiIDAiQXP3L 
U9TAYAYATV 
TQQIVIilU#SN 



mavj:iXXi:.x>pi> 

NIiRK£QSI&KI 
VRbTVnjPPP 
KASETPTLQG 
QP1IV7SBIIi& 
AXtVMIARAVir 
SQILZLKDGK 
BQQSSRPSIiK 
NTAAQVAYVI* 
AHSItLVFYVIi 

FSHlSSSIiQO 
I2VAFG&LID 
KEAFHBYQKR 
KSSLISShlrFR 



LRSSCliRtSMII 
SAIBRVSEAI 
IiSPTVKPGBIj 



QDADIYIiLDD 
KVOKSTITTEP 
IXSALSGQDTE 
QDWHIiSyWAN 
VEI8SQICBHK 
VGW0VAVAV 
LWTIRAYKAB 
AKTUMGQVa 
PPPAffPHEGV 



21 
1 

RVPFWWLKPI. 
TRAIIKCYWK 
LTFGTIilLAI 
DVNKFDQVTV 
LR8KTATFTX> 
I>A£FF&A£KI 
VSIRHIQTFI* 
XiAWGPVGAG 
KBRITEKVIKA 
PlkBAVTAEVS 
IJCSGIDFGSL 
NVFVTEjSSSN 
KQSMLNVTVN 
KPESIZaKAFV 
IPWIAIPLVP 
EROQEriFDAR 
LAliSYALTLM 
XIFDNVMEMy 
DKILTTBIGIi 



31 

\ 

FKIGHKRRBE 
SYTiVLGIFTL 



ARIHrOQIEVI 



LLDBISQRNR 
KSSIiIiSAVIiG 

cauuaxbQLii 

SEIItFELCZCX? 



RSSGKVGFQA 
GGGHVTBKLiD 
IiFEDRNPIGR 
LGIXFIFEillR 
QDUISEAHFIj 
GMFQHCVSQ8 
fiSGGPLVXiKH 
HDIiRKKKSXI 



DEATANVDPR 
XjI^OKKBSIiFY 
FETAL 



QBVQUKBTXB 
TDEIiXQKKIR 
KMVQQLGKAE 



EKFAHCTVIiT XARSZdiTXID 
AAAUTBTAKQ VYFKBNyPHI 



41 
I 

EDDKYSVliPE 
lEESAECVIQP 
CA6HRUCVAM 
QAIAVTALLW 
TGIRZIKKITA 
LLGSVITASR 
QISPSDGSIXMV 
BLAFSHGXiVS 
EDGDLXVI^ 
IliBEKITIliV 
PPVPGTPTIiR 
YKNYFHAGAH 
LSIWYLGXYSa 
XmSFSKDIG 
YFI»ETBRE>VK 
FLTTS&HFAV 
AEVEHMMISV 
IAAI»£K&QEK 
PQEFVItFTGT 
QEtOIiVCUUiA 



51 
1 

DRBQHLGEEL 
IFLQKirMYP 
CHKXy&ICALR 
!4EXGISC3iA6 
KBKSFSNIiIT 
VFVAVTI.YGA 
HVQDFTAPWD 
VHGRXAYVSQ 
BGTTXiSGQQiK 
XHQI^YLKAA 



WIVFIFLXXJ* 
IiTVATVlfPGI 
EUSDUUPLTP 
RIiESTTRfiPV 
RLDAICAKFV 
ERVIS^rmliB 
VGIVGRTQAG 
HRXHI13PFKE 
lUaq gQIIiX I 
RL1CE3CDBPYV 
StSPSOPSTDTX 



60 
120 
180
240
300
360
420
480
540
600
660
720
780
840
900
960
1020
1080
1140
1200
1260 
1320 



Seq ID NO: 573 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1365



1 
1 

AUGGRATCAA 
GGCATAAATG 
TTTGCOVAAr 
AGftAATCCTA 
GATGCTCTCA 
CIGXGGGACC 
AGOATfJ^ACC 
TTGATIGTCA 



CTTGCOOGCG 

attgaaaatt 

AGCTIGGCCA 
AGAAACCAAC 
ATAGTTGG^ 
CAAC^TTATT 
TOTAGAAAAC 
CTCTGCTTAC 
GTTCArcGCAA 
ATCTOCTTTO 



GAAGAGTACT 
ATTGTAA'^C 



IX 
I 

TCrCTATQAT 
GTATCAM^GA 
CCTTGACCAT 
A6TITGCTTC 
CAAAAACAAA 
TGAGACATCT 
AGTACCCAGA 
AAQGATTTAA 
AG6TTTATAT 
AGTTQAATTT 
TACCOCTAOG 

GATTmrrr 

AGAGTGACTT 
TTACTTTGCT 
A0G6CACCAA 
AGCITGGATT 
CXIATGAGAAG 
ATATTGAAAA 
GCATAAXGM5 
ATGCTTTAAA 
TCATAAGTAC 
ACAGATTTTA 
IDGGATCTXTT 



21 31 

I I 
GGGAIVGOOCT AAGAGCCITA 
TGCAAGGAAG GTCACTGTAG 
TCXSACrTATT AOATGOSGCT 
TGAATTTTTT OCTCATGTGG 
TATAAIATTT GTTGCXATAC 
GCTTC3TGGGT AAAATCCTGA 
ATOCAATGCT GAATATTTOG 
TGTTGTCICA GCTTOGGCAC 
ATGCAGCAAC AATATTCMG 
CRTTCGCATT GACTTGGGAT 
ACTCTTTACIT CTCTGGAGAlG 
CCTTTATXCC TTTGTCRSAG 
TTAOUIAA'TT OCTATAGAQA 
CTOCCTAGTA TACXTTOGCftG 
GTAXAGGAGA TTTCCAGCIT 
ACTAAGTTTT TTCTTOGCTA 
GTCAGAGAGA TATTTGtXTC 
CTCTTaOAAT GAGGAAGAAB 
CCTTG6CTTA CTTTOCCTCC 
CIG6AGAGAA TTCaOTTTTA 
TTTCCATGTT TTAATTTATO 
TACACCAOCA AACTTTGTTC 
GCAGCTTTGC AGATACCCAG 



41 

I 

GTGAAACTTG 
GTGTGAjTTGG 

ATCAXOraoT 

TJtfSAiTGTCRC 
ACAGACaAAX^ 

ttgatgtgag 

CTTCATTATT 
TTCAGTTAGG 

ijaaooAcaukCA 
cctiatcatc 
GQC£3USTG6T 
at6tgattca 
TTGIGAATAA 
GTCTTCCGGC 
GGTTGGAAAC 
TGGICCATGT 
TCRACATGGC 
TTIGGASAAT 
IGGC3tGTCAC 
TTC3V6TCTAC 
GAXGGAAACQ 
TTQCTCTTGr 
ACTGA 



51 
1 

TTTACCZAAT 
AA6TOGAGAT 
GATAGGAAGT 
TCATOUIGAA 
TTATACCTGC 
CAATAAC3VTG 
CGCAGATTCT 
AUXTTAAGGAI 



ABCCM3ASM3 
GGTACCTATA 
TCCATATGCr 
AACCTTACCr 
AGCIGCIXAT 
CTGQTZACAG 
T6CCTACAQC 
TTATCRSCAG 
TGAAATGTAT 
TTCTATOCCT 
ACTTQGATAT 
JUSCITTXGAG 
TTIGOOCTCA 



Seq ID NO: 574 Protein sequence
Protein Accession #: Eos sequence



RIBQYPESNA 

RNQQSDPYKI 
CaXOLBliltSF 
X8FGIKStX3Ii 
BSJCYRFYTPP 



11 
\ 

K8I.SETCLPN 
PHWDVTHHB 
EYLASIiPPDS 
ULG&LGSARE 
PIEIVKKTLP 
FFAMVHVAYS 
IiSIaLAVTSXP 
NFVLALVLPS 



21 
I 

GXNGXKDARK 
DALTKTNIXP 
LLVIDGFNVVS 
IEHLPLHIjPT 
IVAriLIfSItV 
IiCIiVKSaiSBSL 
SVSNAmWRE 
IVZLDEJ^IfC 



31 
J 

VTVGVIGSGD 
VAXHREHYTS 
A»AIiC2LGFKD 
LWRGPWVAI 
yiAGLlAAAY 
XIiPLNMAYQQ 
V3FZQSTUSI 
RYPD 



41 

I 

FA]CeZiTXR2>X 
LnDLRKLIiVG 
ASRQVYICSK 
SLATFFFIrYS 
QIiYyQTJKYRR 
VHANXENGWd 
VAUiXSTFHV 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020
1080 
1140 
1200 
1260 
1320 



51 
1 

RCGYEWIGS 
KILXDVSNNM 
WrOAEQCfVIE 
PVRDVXHPYA 
FPP«LETHLQ 
EEEWIRIEMY' 
ZilYGWKRAFB 



60 
120 
180 
240 
300 
360 
420

Seq ID NO: 575 DNA sequence

Nucleic Acid Accession #: NM_001873.1

Coding sequence: 3..1721

1 11 21 31 41 51

| | | | | |

JUUiTGGOQTa CCaarCTCTC dgccgqoccc CTGCCTCGCA GTGGTTTCTC CTGCRGCTCC 60 

CCTGGGCTOC GC3GGCCAGTA GTGCAGCCCG TGGRQCCGCG GCTTTGCCCX; TCTCXTTCTOG 120 

GTOQCCCCAB TGCQCySGQCT QACT^CTCATT CAGCCOGGGA AQGTQRSXXG A6TAGAGGCT 180 

GGTGCQGAAC TTGOOGCCCX: CAGCBGOGCC GGCtaGOCTAA GCCCAGGGCC GGGC31GACAA 240 

AAGAGGCCGC OCGCQTAGQA Aaoc/^C<^CC GGCGGCOGCG GAGOGCAGOS ATGGCOGGGC 300 

GAGGGGGC3«3 CX3CGCTGCTG GCTCTGTGCG GGQCACTGQC TGCCTGCGGG TGGCTCCTGQ 360 

GC3GCX33AAGC CCftflGABCCq! GQGGCtSCCCG CGGOSGGCAT GAGGCGQCGC CGGCGGCTQC 420 

AfiCftAGAGGA CGGCATCTCC TTCGJVGrTACC ACOGCTACCC CGAGCTQOSC SAQGCGCTa? 480 

TGTCCOTOTO GCTOCAGTac RCCGCCPl'SCA GCAGGATTTA. CAGBGIGGGS CX3C3U3CTTOS 540 

AGGGCOGGGA GCTOCIGGTC ATOGAGCTQT CaSftl^RAOCC TQGOGTCCAT GAGCCTCiaTG 600 

AOC3CTGAA.TT TAAATACATT GGGAATATGC ATGGGAATGA OQC^rGTTGG^^. OGAGAACT6C 660 

TCATTTTCTT GGCCCAGTAC CTA.TGCAACQ AATACCAGAA GQGGAACGAG ACAATTGTCA 720 

AccTGATcov cAGTAcx^GGc: attcacatca xgccttccct gaacccaoat ggctxtgaga 780

AGGCAGOCTC TCAGCCTGGr GRACTCAAQG ACTQGTTTGT GGGTGGAASC AATGOCCAOG 840

GAATAGATCT QAAOCQOAAC TTTCCftGACC TGGATAGGAT AOTGTAOQTG AAllGAGAAAG 900 

AAGGTGGTCC AAATAATCAT CTGTTOAAAA ATATC3A2VGAA AATT6TGGAT CAAAACACR& 960 

AQCTTQCrCC TGAGAOCAAS GCTGTCATTC ATTGGATTAT QQATATTCCT TTTGTGCTTT 1020

CTGCCAATCT CCATGGAGGIA GACCTTGTGG CCAATTATOC ATATOATOAQ ACGCOGAGTG 1080

GTASTGCTCA CGhKTACMSC TOCTCCCCAG ATGACGCCAT TSTXCCAAAGC TTGGCCOGTO 1140 

CATACTCTTC TTTCAACCOG GCCATGTCTG ACCCCAATOG GCCACCATGT CGCAAGAATQ 1200 

KSGMXShCMS CAGCTTTGTA GATGGAACCA CCAAC3GGTOG TGCITGGTAC AGGOTACCTG 1260 

QAfaQOATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320 

GCTGTGAGAA G7TCOCACCT GAAOAOACTC OXSAAGACCTA CTGOGAOGAT AACAAAAACI 1380 

CCCTCATTAG CXACerrGAG CAGATACACC GAOQAGITAA ABGA-mGTC GGAGACCTTC 1440

AAGGTAACCC AATTGOGAAT QCCACCATCT OCStn^GAAOG AATAiGACCnC GATIGITACAT 1500

CCGCAAAGGA TGGTGATTAC: TOGAGATTGC TTATACCTGO AAACIATAAA CTTACftGCCT 1560 

CAGCTCCAGG CTATCTGGCA" ATAACAAAQA AAGTGGCAGT TCCTTACAQC CCTGCTGCTG 1620 

GGGTIQATTT TGAACTGGAG TC31TTTTCIG AAAGQAAAOA AQASGAGAAG GAAGAATIOA 1680

TGGAATGCTG OAAAATGATG TCAOAAACXT TAAATTTTTA AAAM3BCXTC TABTTASCrEG 1740 

CTTTAAATCr ATCTATATAA TGTAGTATGA TaTAATGTOG TCTTTTTTTT AGATTTTGTG 1800 

CASTTAATAC TTAACRTTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTECCmA 1860 

AAATAARTTbG CCICTTAGGT A2\AAATATAA GAACTTGATA TATTTCATTC TCITATATAG 1920

TATTCAT7TT CCTACCCATA TTACACAAAA AAGTATAGAA AAGATTTAAQ TAATTTTGCC 1980 

ATCX:TAGGCr TAAATGCAAT ATXCCTG6XA TTATTTACAA TGC2US3\MTT TTTmSTAAT 2040

TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TTVCIGTAATT GGTGACAATI3 TCACATAATG 2100 

AATGCXATOXS AAAAGGTTAA CSiGATACAGC TC5GGRQTTaT GaGCACTCTA CTGCAAGACT 2160

TAAATASTTC AGTATAAATT OTCGTTTTTT TCrTGTGCTG ACTAACTATA AQCSWrfiATCT 2220 

TGXIAATGCA TTTTTQATGG GAAGAAAAG6 ^^ACAICnTA CAAAGAGGTT TTATGAAAAS 2280 

AATAAftAATT OUZITCTTGC TTGTACATAT AGSkGC3^TA CaKETATATT ATGTASTGCB 2340 

TTAACACTAC TIAAAAGflTT AGGGTTTTCT CrTGGTTGXA Q&OTGGCXXA GAIWrCGCATT 2400
CTGAATGAAT AAAGGTT7UVA AAAAAATCCC CSkGTGAAIUUl AAA 

Seq ID NO: 576 Protein sequence
Protein Accession #: NP_001864.1



I 

mOEUSOSAJA 
BALV8VNLQC 
RBIiIrlPIAQr 
MAQGIDUJHW 
FVXtSLAXfUEGG 
KKNDDD8SFV 



11 
I 

AliCSAZAAiOQ 
TAISEtXyrVG 
LCMEyQXGHE 
FEDLDRIVW 
ZXLVANYPYDB 
DGTIHGGAinr 



I^TASAPGXIA ITKKVAVPyS 



21 31 41 51 

1)11 
HLLGABAQBP GASAAOffiffiR fiRLQQEOOIS VBSWOCSBm 
SSEBGRELIiV IBLSEMPGV2 EFGEPEFJCVI GNMQQjlBAVQ 
TZVEBEiIHSTR ZBXMPfiLNPD 6FBKAASQFO £CKDHFVGRS 
NEKBGGPN23H LZiKNHKKIVD QlJTKLAPETK AVIRNIMDIP 
TRSaSKES^G eSPDDAIFQS XARAYS9FNF AMSDFHBPPC 
8VPGGMQDF1T ini95HCF8IT VELSCEKFPP EETXiXTXVED 
SDLQBHPIAN ATZSVEGIDH DVTSAKDGmr tOOiLIPGIjlYK 
FAAGODFELB SFSSKKEEEK EELHEmVKBIH SBSEJtP 



60 
120 
180
240
300
360
420 



Seq ID NO: 577 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..933



1 
1 

ATXTTGCAiaCA 
TTOGACAAGA 
ITGOCCIGTG 
GACTGTCCCG 
QCCCX3CrflCC 
AATAACT6TC 
GGGCAGGXGT 
ATCATOGGCA 
CACCAGCGGA 
CTGCTGTOCC 
AATAAIGGCA 
OCACCCTCCT 
GQGCCCfAGT 

GACAOCAGOC 



11 
) 

ATGOAOOaTG 
GTSASGAGAA 



ATGGCAGOSA 
ACIGCAACIAA 
AAGACAACAG 
XtGIGACXTC 

AGOGGAACAA 
GOCTGGTGGT 
TCGhOTAXQT 
ACTCCXSU3GC 



GTGOCftACAG 
ACAGCCX9GGG 



21 
I 

GATCCGGGGC 
GGAGTGOC3CC 
CCATTGCATC 
TGAAGAOAAC 
OQGCCTCTGT 
TGATQAGGAA 
AGiBGft ACCAA 
TTTTGTGCrCG 
CCTCATGACG 
CCTGOACCAC 
GGCCAGC3CAG 
dTGCTGGI^ 
aSAATCTCTG 
TSCCAGCXC9C 
GQUSCCTGGC 



31 
I 

GCCTOGCAGT 
AAG8CXAAGT 
ATTGOTGBCT 
TGCACAGCAA 
ATTGAC^^AGA 
AGCTCTGAAA 
CElXS'iXdTA'iT 
GTGOTGGGOC 
CTGOOCBTGC 
CCOCACCACT 
GOGGAGCAGA 
CAGAOGCCIG 
AACCAAGCCB 
CAGGC3UX:CA 
CQCCAGC3Af36 



41 
I 

GTGAOGOGCT 

CGAAarGrGG 

TCCGC3V6CAA 
ACCCTCTGCT 
GCTTCATCTQ 
OTTCTCAAGA 
ACCCCAGCAT 
TGCTGGOiCr 
ACJCSGGCTGCA 
QCAACGTCAC 
ATGOOTCGGA 
GGTGGTATGA 
AQCIGCCOOC 
GCAGCCTCCr 
GCACTGCTGA 



51 
I 

GCCXGACIGC 



TOQGTTTG2U3 
TTGCTCCROC 
OaATGOAjCAB 
AOCOGGCnST 
CACCTATGCC 
OGTCTTGCAC 

gcaocctgtg 
ctacaacotc 
agtaggcicc 

OCTTC3CTCCA 
CTACOGCTCC 

gagcotggaa 
gcocagggac 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600
660 
720 
780 
840 
900 




PCT/US02/36810



TCTGAGCCCA. OCCAGGGCAC TG&AGAACTA 

Seq ID NO: 578 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

HCSNGRCIPG ANQCDGLPDC FDRSDIEKECF KAK8K0GFTF FPCASGIHCI IGRFKCNGFE 60 

DCPDQSDBEH CTAHPLIiCST A&YHCKHGIiC ISKSFIC33GQ JONCXiDNSEIEE SCBSSQBPGS 120 

GQVFVTBEKQ LVYYPSITYA XIGSSVIFVL WALlALVm HQHKHMNUTr LPVHRI.QHPV 180

LXiSRIiWIiDH PHEqiSIVTYNV NKQXQYVASQ ASQNASEVGS PPSYSEALLD QRPAHYDLPP 240 

PPYSSDTSSL KQADIiPPYSS RS6SAII8AS8 QAA8SI>LSVE DTSKSPOQPG PQEGTAEPIID 300 
SEPgQOTBZV 

Seq ID NO: 579 DNA sequence

Nucleic Acid Accession #: AF179274.1
Coding sequence: 1..1125

1 11 21 31 41 51

| | | | | |

ATGGTGCTGT GGGAGTCCCC GOGGCAOTGC AGCAitSCTGGA CSUTrTTGCGA GGGCTTTTGC 60

IGGClGCrCSC ^TGCOTGCCCX^T CATGCTACTC AT0GTA6C0C GCCCOGTOAA GCTCGCTGCX 120 

TTCCCIAGCI CCITftAaiGA CTOGCMUkCQ COCACOOOCT 66AATT6CTC TGOTTATGAT 180

GACAGAGAAA ATGATCTCTT OCTCTGIGAC AOCAACACCT OTMWrrCGA TGGGGAAT6T 240 

TrA2U3AArrrs lanisACACTGT GACrrGoexc tgtcagttca aotgcaacaa toactatgto 300

CCTGTGTGTG GCTCCAATG6 GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360

TGCAAACAGC AGAfilGftGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420 

TCAGOATCT? QAQATgO&GIT CCATOAAGGC TCXGSAGAAA CTAGTCAAAA GGAGACATCC 480 

AOCTGTGATA TTTGOCAGTT TG6T6CAGIAA IGTanCQAAG AT6CGGA0GA TGTCTGGTST 540 

GTOTOTAATA TTGACTGTTC TC3UUVCCAAC TTCAATOCCC TCTGCGCTTC TQATGOOAAA 600

TCTTATGATA ATGCRTGOCa RATCAAA(3AA GCMCOTQTC AGAAACWKA GAAAAXTGAA 660 

GTCATGTCTT TGGQTCOATI} TCAA13ATAAC ACAACTA£:AA CIACTAAGTC TGAAGATGGO 720

CKTTXSGCML GAACAGATTA TGCAGAGAAT GCTAACftAAT TACUU&GAAAG TGCCAGAGAA 780

CACCACATAC CTTOTQOOGn. ACATTACAAT GGCTTCTGCA TGCATGGGAA QTGTaAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCRGG TOTOATGCTS GTrATACIGG ACARCRCTGT 900

GAAAAAAAOS ACTACAQTQT TCTATACCSTT GTTCCCGGTC CT6TACX3ATT TCAOTATOTC 960 

TTAATOGCMS CTGTGATTGG AACAATTCAG ATTQCTGTCS^ TCTGTGTGGT GGTCCTCTGC 1020 

ATCACAAGOA AATGCCCCAS AnC3CAACM»k ATTCACAGAC AGAAGCAAAA TACAOt^QCAC 1080
TACIU3TTGA6 &CAATACAAC AAGTUSCGTCC ACQAGOTTAA TCIGA 






Seq ID NO: 580 Protein sequence
Protein Accession #: NP_057276.2



1 11 21 31 41 51 

| | | | | |

MVXJME9P11QC SSHTLCBOFC HI^LI>PVMLL ZVAKPVKLAA FFTSLSDOQT PTGHMCS{?YI] 60

DRBNDIiFLCD TNTCKFDGEC IiRlGDTVTCV OQPKCNNDYV PVCQfiKQESy QtSTBCYLRQAA 120 

CKKXiSEXZiW SEGSCKIDAS SQSOElSUHBa eGatfiQXBTS TCDIGQFGAE CDEDASBEVHC 180

VCHZDCSQIN FHPLCASDGK SYESIACQIKE ASOQKQEKXB "VMSUSRCQBlSt TTTTIKSSDG 240 

HYARTDYA&I? AMXIiBBSASB BHIPCFfiHUff GPOWSKCBR SZKMQBPSCR Ca»GXTQQElC 300

EKKDYSVLYV VPGFVRFQYV I^IAATZCTIQ lAVICVWIiC rCRKCCRGHR IHSCHKSKTGH 360 



Seq ID NO: 581 DNA sequence
Nucleic Acid Accession #: S78203.1
Coding sequence:

1 11 21 31 41 51 

I I I I 1 1 

ATOAATCCTT TCCAGAAAAA TGAGTCCAAO QAAACICTTT TTXCACCTGT CTOCATTGAA 60 

6M3GTACCAC CTGGAOChCC TA6GCCTCCA AAGAAGCCAT CTCCGACAAT CIGTOGCTCC 120 

AACTATCCAC TOAaCATTGC CTTCATTOTQ GTQARTtS&AT TCTGOGAGOG CTTTTCCTAT 180 

TATGGRATGA AAGCTGTGCT GATCCTGTAT TTOCTGTATT TCCTGCACTG GRATG&AfiAT 240 

ACCTCCACAT CTAIZkTAC^CA TGCXysTVCJUSC AGCCTCTGTT ATTTTACTCC CATCCXGGQA 300

GCAGOCATTO CTGACTCST6 GTTGGGAAAA TTCAAOACAA TCATCTATCT CTOCTTGGTG 360

TAIGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGGCTTAC CRATACTQGQ AGQACAAGTG 420 

GTACAC3VCAG TCCIATCATT OATOGQCqTO AGTCTAAXAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CA6TTTGAAG AAAAACATOC AOASOAACGa 540 

ACrAGATACT TCTCAGTCTT CTAOCTGIOC ATCRATGCAO GGBGOTTGAT TTCTACATTT 600 

ATCACACOCA rtGCia&QAGG AGATGTGCAA 7GTTTTGGAG AAGACTGCTA TGCATTOGCT 660 

TTTQGftCXTC CABGACTGCT CATGGTAATT GCACTTOnO TfiTTTGCAAT GGGAAGCAAA 720 

ATATACAMA AACCAdCCCCC TGAAGGAAaC ATAGTGGCTC AAGTTTTCAA ATOTATCTGG 780 

TlTGCTATTX C3CAAT0GTTT CAAGAAGGGT TCTGGAOACA TTCCAAAGCX3 ACAGCACTGG 840 

CTAGACTGGG CAGCTOAGAA ArATCCAAAG CAGCXCATTA TOGAT6TAAA OGCACTGAGC 900 

AGGGTACIAT TOCIITATAT GCCATTGOCX: ATCrTTCTGOQ CTCTTT^TGGA TCAGCAGGGT 960 

TCAOGATGGA CITTGCAAQC CATCAGGATG AATAGGAATT TGGGGTTTTT TOTQCTTCAG 1020 

CCG6ACX3U3A TGCAG6TTCT AAATOCCTTT CTGGTTCTTA TC3TCATCCC GrteXTTGAC 1080 

TlTGTCATTT ATOQTCTGGT CTCXAAGTGT OGAATTAACT TCTCATCACr TA0GAAAATQ 1140 

GCTGTTGGTA TGATOCTAGC GTGOCTGGCA TTTGC2\GTT6 aSGCPtGCXGV AGAGATAAAA 1200 

ATAAATGAAA TGGOCCC2U3C CCAXSaCMSGT CCGCAGGAOQ TTTTOCTACA AGTCTTGAAT 1260 

CIGGCSUSATG ATGAGGTQAA GGTGACAGIO GIOGGAAATG AAAACMTTC TCXGTTGATA 1320 

GAGTOCATCA AA3K!CTTTCA GAAAACaUXIA CACTATTCX3L AACXOCAQCT GAAA2UMAA 1380 

AfiCXaUSGATT i:TCACTTCX3i CdQAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GlGCAGGflOA ASftACTGfSTA CSWSTCTTSrC ATTCOTGAIUS ATGGGAACAG TKTCtCChBC 1500




ATGATOGTAA 
AACACTTTGG 
fiRAGACTATa 
TGTAOAACAG 
TATCTGnrO 
ATTOCRGCCA 
GGOGAGGTC3^ 
ATOAAATCTG 
CTTGTTGTGG 
CTCCTGCTGQ 
ACAOnOOATA 
AAACTAGAiQA 



AGGATACAGA 

ataaagatgt 

GTGTGTCTGC 

aaqataagaa 
ttaitactaa 
acaaaatgtc 
tgttctctqt 

tgciccaggc 

CACAQTTCAG 
TGATCTGOCX 
TGCGGOarCC 
CCAAGAAGAC 



AACTCAAAACA 
CAACATCICC 
TTATAGAACT 
CTTTTCTCTG 
TAACACCAAT 
CATTGCGTGG 
CAGAaOTCTT 
AGCTTOC5CTA 
TGGCCTGGTA 
GATCTXCTCC 
AGCAGATAAG 
AAAACTCTQA 



ACCAATGGGA 
CTOAGTACAD 
GTGCAAAGAG 
AATTTGGOTC 
CAGGGTCTTC 
CAGCTACCAC 
GAGTTTTCTT 
TTQACAATTG 
C2VGXOGGCCX? 
ATCATGQGCT 
CACATTCCTC 



Seq ID NO: 582 Protein sequence
Protein Accession #: AAB34388.1



1 

I 

MNPFQBNSSK 
YGKKAVIilIiY 
YVLGHVJKSL 
TRYFSVPYIiS 
lYNKPPPBQH 
feVLFIfYIPLP 
PVITOt.VSKC 
LfiDDEVKVTV 
VQEKNWYfiLV 
BDYGVSAYRT 
IPANKMSIAH 
LWAQPeOLV 
KLBTKE^TKL 



11 
I 

ETLFSPVBIE 
FLYFLHHNED 
GALPILOOaV 
IKftE39CiX8TF 
IVAQVPKCIW 
MFMAL£J)QQG 
QINFSSLRKM 
V6NIEHH9IiZjZ 

VQRGEYPAVH 
QLPQYALVTA 
QnASFXX>F&C 



21 
I 

EVPPRPPgpp 
TSTSrVHAFS 
VSTVL8I>IGL 
ITFMLRGDVQ 
PA19NRFKNR 
SRWTLQAIfiH 
AVGMXIiACEjA 
B8IKSFQKIP 
mVKDTBSKT 
CRTEDKNPSL 
GSVMFBV76L 

IiIOaVICZiIFS 



31 

I 

ZKPSPTICfiS 
SliCYFTPILG 
SLIALOTQBZ 
CPOEDCVAXA 
S6DIPKRQHH 

FAVAAAVSIIC 



TNGHTTVRFV 

NIiGtiLDP(3AA 
EFSYSQAPSS 
IMGYYYVPVK 



TGACARCCGT 
ATAOCTCTCT 
GAGAATRCCC 
TTCTAGACTT 
AGGOCTGGAA 
AATATGCCCT 
ATXCTCAIGGC 
CAGTTGGGAA 
AATTCATTTT 
ACTACTATGT 
ACATCCAGGG 



41 

I 

HYPLSIAFIV 
AAlADSWIiGK 
KPCVAAFGGD 
F6VP6LLHVI 
ZiDHAAEKYPX 
PDQMQVLKPP 
USEHAPAQSG 
SQDFBFBLKy 

TTLFVITNNTN 
NKSVLQAAWL 
TSDKROFADK 



GAGGTTTGTT 
CAATGTTGGT 
TGCAGTGCAC 
TGGXGOIOCA 
GATTGAAGAC 
GGrTACAGC]* 
TCCCTCTAGC 
TATCATCGTQ 
CTTTTCCTGC 
TCCTGTAAAG 
GAACATGKTC 



51 



VKKFCERFSY 
FKTIIYLSIiV 



Seq ID NO: 583 DNA sequence

Nucleic Acid Accession #: NM_032642.1

Coding sequence: 184..1263



1 
1 

GRCCATTAGC 
TAGtCTTGAAC 
ACAGAGGGAA 
ACCATGCCXZA 
CTQACAGAOG 
TTTfiTCATOQ 
AAGCTGTGGC 
ATCAAGOAAT 
OCATCTGTCT 
GIGAGCGCOG 
ACCTQCGGCT 
GGCTGTGGGG 
GAGCGABAGA 
CAAAACAACG 
CAOSGCBTCI 



AM3G6GCSGC 
9:ATaTGGACC 
C3UGGGCOGCC 
GQQOBTQGCT 
TGGTGCTSCT 

xnscxxxaoAG 

ATAAATCTAT 
GAAAGAtCGAA 
GACTTTGCTG 



GhGGfiAGGTT 
QTTAGAGGAC 
ACrCAQCTTC 
ABOCCrGGTG 



mSATCTGCA 
GGTTCACXAO 
OCCTAAACIG 
GAOQTTATAC 
CTCCAGCGAG 
GGC3QT6CTCA 
TGGAAAAAAA 



11 
I 

AGGCACCCAG 
CTAGGAACTG 
CCCTACTCXiQ 
OCCTGCTGCT 
CCAACTCCTG 
<3TGOOCZ«5CC 
AATTOTACCA 
GCCAGCACO^ 
TTGOOAGflGT 
CXSGGCX3TGGT 
GCAGCC23QAC 
ACAAOGTGGA 
A6AACTTT(?C 
AGGCCGGTOG 
03GGGTCX3G 
ACCGGCTGAA 
TGC3AGCXGST 
CCAG0CCC3QA 
TCTOCAACAA 
AC3\AOCAGXT 
TCGTCAGGI6 
GGOCTGCrCC 
ATAAATCTAT 
AATGGAAA6G 
GITCTCTCCT 
AOQATGTAGG 
GTGGTIQGAT 
TacCTGTGAT 

aacctcgatg 
ctctcttact 

CTTCCCTAAA 
CCrC0C3GGAC 
CTC erACC TG 
AATQTTTGCG 

GGaOGOCTCA 
TCATCTCTGC 
AAAAAGAAAA 



21 
I 

GOCTGTCTTT 
CAQOaCCAGA 
GAAACTGTCA 
(aCTOTTCACG 
GTGGTCATTA 
CSTQTGCAGT 
GGAGCACSLTG 
OTTCOGGCAG 
CATGCAGA*EA 

CAAcsosm: 

GGC3G0GGCCC 
GTAQGaCIAG 
CAAAGGATGA 
CAGGSCTQTO 
CAGCCTCAAS 
GGAGAAGTAC 
CAACAGCOBC 
CTACTGCCTG 
GAOCXCGGAG 
CAAGAGCGTG 
TAAGAAeiQC 
OGSCCOCCOC 
TTXA*CATTTG 
AAflAGCTTAT 
CTTQGTGGffT 
GACTTGQAAA 
GGAQGAGATG 
CCIGGCCACT 
TCTTCAGGGT 
CTITCATOCA 
ATGAGAAGTC 



A2U3AGGAAA6 
CCAQGGCTGC 
CACCCX:CGGG 
CAAACCAC34G 
COCAGGTGTA 
AA21AAAAAAA 



31 
] 

GGCTCOGAAA 
GAGATTCCAC 
GTOCCAOGGC 
GCTGCTCTGC 
GCTTTBAACC 
CAGCTTCCCG 
□CCIACATA6 
CCSGOGeXGQA 
GGCAGOOGAG 
AGOCGGGCXrr 
AAGGAOCTGC 
CGCTT06OCA 
GhflSAQCAGQ 
TATAMSATGG 
AJOCTGCTtSQC: 
GACaOOGCGG 
TTCA£3CCAGC 

OQCAAaaaoiA 

GGCATGGKIO 

AOGG2U3ATCG 
TGCACTCTOC 
TRTAAGTAAA 
TTAAGAGAOG 
GGGAOAGAG8 
TATTTACTGT 
ATCTTQTCTt3 
AOC3CCAA6A6 
CTTGTCCSftiGA 
OBTGCACrTG 
CAAGGTCATC 
OCTTTOC2VGC 
GGGGCCATTT 
AOAAGOCAGG 
GAOGGGAAGC 
GAOGCTGCAA 
OGGTTTCTCr 
AA 



41 
I 

CGGTGGCCCC 
IGGAGGCrOA 
ACIGGGGftSG 
T6TCXZAC3CTG 
OGGTGCAGAS 
GGCTCTCGCC 
GGGAGGGAGC 
ATTOCAiGCAC 
ASACGGCCl'I 
GCCSOQAGGG 
COOGGGACTG 
AGGAiSTTTGT 
GCOQGCTGCT 
GftGACGTAGC 
TGCRGCTGGC 
CCQCCATGCG 
CCAOCCXZGGA 
GCACGGGCTC 
GCSSTQAGCT 
OCTGOCACVG 
VGOACCAOTA 
CTCACAAAGG 
TGGGTGGGTS 
CTGQAGATCT 
GCTTTTTCTC 
CT6TCC3UXA 
GAAGTCTAGA 
GCCCIAIOAA 
ATOTAdCSATGS 
TGCGGCATCT 
TCIGGOCCAG 
OftfJAATTCTT 
GACGIGACArr 
GTGCATGAOC 
TTGAGCTGCT 
CGGGTCZAGGC 
CTGACATTAA 



AltWFAMQSK 
QtlMDVKALT 
LVLIPIPLFD 
PQEVFI.QVLN 

IjSTDT9UIVG 
QGLQAHKIED 
L*rlAV<»IIIV 
HIPHZQGHHI 



51 
1 

CAA-TGTAQCC 
TGGAOQGGTG 
GCrGAf3GCC3B 
GGCTCAGCTT 
A£XX?6AfiATO 
TGGCCAGAGG 
CAAGACTOQC 
AGOGGACAAC 
CACGCACXK3Q 
CGAGCTCTCC 
GCTGTGGQGC 
GGATGOCOGG 
CMTGAACCIO 
CTGCAAA'XGC 
OGAGTTCOGC 
CQTCAOCOGC 
GGACCTGGTC 

ociaaacAoa 

CATGTGCTGC 
CAAjGTTCCAC 
CATCTGTAAA 
TCa^AXA'TTAT 
CTATACAATG 
CTGAGGBGTG 
TCOCTCTGGC 
OOGCCTGGAO 
GICTTTGTT6 



GTTCOGXAAG 
GOVGTTTACa 

tgaccacaga 
catgctccac 

GTCAGGAAftS 
AGG<=K30GTG 
GCTGTCACTC 
TG6CX3GGCCC 
ATGCCCTTCA 



1560 
1620 
1680 
1740 

1800

1860 
1920 
1980 
2040 
2100 
2160 



60 
120 

180

240 
300 
360 
420 
480 
540 

600

660 
720 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100
2160 
2220 



Seq ID NO: 584 Protein sequence
Protein Accession #: NP_116031.1

1 11 21 31 41 51

| | | | | |

HPSLLLLFTA AIiLSSHAQbZi TDAnSWHSLA LNPVQRPSMF IIGftQFVCSQ I>FGI«BPGaBX 
LGQLYQBIMA YIGBGAKTOX ECBQQBQFHQR RHNCSlfADMA 6VFGRVMQXG SR&TAFTHAV 



60 
120 






3AAGVVNAIS RACREOBLST CGC9RTARPK DL.PRDWI,WGG CODNVjeYGVH FAXBFVIKARE 180 

REECNPAKGSB EQaRVIJ«KILQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT QKUShKEFBK 240 

VQDKUKEKTO SAAAMRVTRK G£U<HI>VHSRF TQPTPEDLVY VDPSSDYCLR NBSTGSDGTQ 300 
GRLCaiKTSBG KDGCBIMQCG RGYNQFKSVQ VERCHC3CSHW CCPVRCKKCT BlVDQrXCK 

Seq ID NO: 585 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1479

1 11 21 31 41 51

| | | | | |

ATGQCTTTGA ACTCAGCKTC ACC^CCAGCT ATTGGACCTT ACTATGAAAA CCZATOGATAC 60 

CAACGGGAAA ACCCCTATCx: C3I3CACAGCCC ACTGTGGTCC C3CACTGTCTA CGAG6TGCAT 120 

cjcsQGcrcAGr actaccx:gtc ooccgtgccx: cagtaogccc ogagggtcct gacxjcaggct 180

TCCAACCCCG TOGTCTGCAC QCAGCCCAAA TCCCCATCCQ OOACAGTSTG CAGCTCftAflQ 240 

ACTAAGAAAG CACTQTQCAT CACCTTQACC CZGQQGAGCT TCCTCGTGGG AGCTGCGCTG 300 

GCC3GC!rGGCC TACTCTGGAA GTTCATGGGC AGGAAGIGCT GCAACTCTGG GATMSAGTQC 360 

GACTCCTCAS GTACCroCRT CftACCCCTCT AACTGGTGTG ATOGCGTGTC ACACTOCCCC 420 

GGOGGQGAGG ACX33VGAATCG GTQTOTTCGC CTCTACGGAC CAAACTTCAT CCTTCAGGltt 480 

TACTCATCrC AOAGfSAAGTC CTGGCACCCT OTGTGCCA2U3 AGGACTGGAA C3AGAACTAC 540 

GGGCGQC5CGG CCTGCAGGGA CATQGSCTAT AAGAATAATT TTTACTCTRG CCAAGGAATA 600 

QTOOATGACA GCGSATCCAC CAGCrTTATG 2UUU:TIBAACA CRMSVQCCQQ C3UVTGTCGAT 660 

ATCTATAAAA AACTGTACCA CAGTORTQCC TGTTCTTCM. AA<3CRQTGGT TTCTTTACac 720 

TGflAXnSCCT GCGGGGTCAA CTTGAACTCA AflCCXSCCaaA GCAGGATOGT GGGCGGCGAG 780

AQOGOGCTCC CGOGGGOCTG GCCCTOGCAG GTCAGCCTGC AOGTCCftSAA CGXCCAOGT6 840

TGOQQAOaCT CKATCAKSu: CCCOSftGTGG ATCXSTQACAQ C!CGCCCACTG GGrGSAAAAA 900 

CCTCTTAACA ATCCATGGCA TTG6ACGGCA TTTGOGGGGA TTTTGAGUMai. ATCTTTCATG 960 

TTCTATGGRG COGGATACS^A AGTAGAAAAA GTOATOTCTC ATCCAAATTA TGACTOCftaC3 1020 

ACCaUGAACA ATGACATTGC GCTGATGAAS CTO(3W3AAGC CTCTGACTTT CM^EOACCTA lOaO 

OTGAAACCAG TGT(?rCTGCC C&ACCCACJGC ATGATGCTGC itfSOZAGAACA GCTCTQCrGG H40 

AlTTCCQGCTr GGGGGGCCRC CGAOGAGAAA GGGAftSAOCr CAGAAGTGCT GIACOCCiaCC 1200 

AAGGTGCTXC TCATTGAGAC ACAGABATQC AACAGCAGAT ArrGTCTATGA CAACCTGAXC 1260 

ACACCAQCCA TGATCTOTOC CGGCTTCCTG GAQGaQAACG TOGATTCTTG CCnOGGTGAC 1320 

A6TGGAGGGC CTCTGGTCAC TTOQAAGAAC AATATCTGGT GGCTOATAOa GGATACAAGC 13 BO 

TGGGQTTCTG GCTGTGCCAA AGCTTACAGA CCAGOflGTGT AOSGGAAIGT GATGQTATTC 1440 
ACGGACTGGA TTTATCGACA AATGAGGOCA (SAOGGCTAA 

Seq ID NO: 586 Protein sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MALklSGSPPA XGPYySMEOy QPENPYPAQP TWPTWEVH PAQYYFSFVP QYAPRVIiTQA 60 

StanrVCXQtK SPSGTVCTSK TKKALCITLT LGrPLVOAAL AAGLIMKEMG SKCSNSGIBC 120 

D&SSTCZNPS NWCDOVSECP GGEDEKRCVR I*YGENPILQV YSSQRKSHHP VCQDDnKBHY 180 

GHAACRDMGr raNFYSSQGl VDDSGSTSFM KLKTSAaCaiVD lYKKDYHSDA GBSKAWSIK 240 

CIACGVHLHS SaQSHIVGGE SAIiPGAWl^ VSIiHVQBIVHV OGGGIITPfiH XVXAAHCVEK 300 

ELHKPHHMXA FAGILRQSBM FYOftGXQVBK VISHPNY23SK TKHKDIAUflK LQKPLTBSDb 360 

VKPVCLEMPG HMLQP&QliO? ISGHGATESK GKTSBVLHAA 3CVLLIHTQRC KSaYVYEHIir 420 

TPAHXCAGFL QQNVDSC5QGD SGGPLVTSKW JilWWLIGDrS HGSQCRKAYR PGVYGNVMVP 480 
TDHIYSQHRA DG 

Seq ID NO: 587 DNA sequence

Nucleic Acid Accession #: NM_005656.1

Coding sequence: 57..1535

1 11 21 31 41 51

| | | | | |

CSTCATATTGA ACATTOCAGA TACCTATCAT TACTC3GATGG TGTTQATAAC AGCAAGATGG 60 

CTTIGAACTC AGGGTCACCA CCAGCTATTG GftCCTTACTA TGAAAACCAT GGATACCAAC 120 

06GAAAAC0C CTATCCOGCA CRGC3CCACTS TOQTOCCCAC TGTCTAOSAa QTGCATCOGG IBO 

CTCRGTACTA CCOGTCOC3CC GTaC3CCCAOT AOGCCOCXSAG GGTCCTGAOS CAGQCTTCCA 240 

AOCCCBTOSr CTGCAOQCAG CCCftAATOCC CATCaaGOAC AGTGTGCAOC TCAAAGACTA 300 

AGAAAGCAd OTQCATCACC VXGACCCTGO GGACCITOCT CGTOGQASaCT GCQCTGGCCG 360 

CTGGCCXACr CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GA/STGCQACT 420 

CCTCftGGTAC CTGCATCAAC -CCCXCIAACT GGTGTGATGG CGTGTCAIZAC TGCCCCGGCG 480 

GQGAGGACCA GRATCGGTGT QTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATOTACT 540 

CarCTCAQAB GAZ^TCSCTGG CACCCTGrGT 0CCAAGAOGA CTGGAACQAG AACTACGGGC 600 

OGSOSOCera CAGGGACKFG GGCTATAAGA AXAATTTTTA CTCIAGCCAA GOAATAGTGG 660 

ATGACAG0G6 ATCXZACCAGC TTTATGAAIU: TGAACACAAG TOODGGCAAT GTOGAIATCT 720 

AXAAAAAACT QTAOCACAjgr GATGCCTGTT CTTC:aaaagC AGTGGTTrCT TTACaCTGTT 780 

TAfiq!CXaCX3G GGTCAACTTG AACTCAAGCJC GOCAGftGCAG GftTOGlX3GeC GGTGAGAGCG 840 

OGC3PCCCGQQ GGCCTGGCCC TOGCAGGTCA GGCTOCAQGT CCAGAAOSIC CACQTOTCOG 900 

GluaGcrCCAT CATCACC90CC GAGTGGATCXS TGACAGCGGC CCTVCTGOGTG GAAAAACCTC 960 

TTAACAATOC ATGGCATTGG AGG6CATTTG OSGGOATTTT QAGACAATCT TICATOTTCT 1020 

ATGGAGCC6G ATACCAAQTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCXAAGACCA lOHO 

AGAACRATGA CATTGCGCTG ATGAAGCTac AGAAGCCTCT OACTTTCAAC GACCrAOTQA 1140 

AACCAGTGTG TCTGCCCAAC CCSUSGCATGA TGCIGCASCC AGAACRl3CrC TGCTGGATTT 1200 

CnagOTGOGG GGCCACCCSAG OnGAAAGGQA AGACCTCAGA ASTGCTGAAC GCtGCCZlAGG 1260 

T0CTTCTCAT TGAGAC^CAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 132 0 

CAGCGftTGAT CTQTGCOGGC TTCCTOCAGa GGAACGXCGA TTCTTOCCAQ GGTGACSiGTG 1380 

GAGCaC3CTCT GGTCACTTOS AACAACAATA TCTOGTGGCT GATAGGGGAT ACAAGCPGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACftfiACCRG GhfiTSTAOSG GAAT6T6RTG GTATTCRCQQ 1500 

ACXGGATTTA TCOACAAATG AAGGCAAACG GCTAATCCAC ATaOTCTTOG TCCTTQAOGT 1560 

CSTTTTACAA. GAAAACAATG GGGCTQGTTT TGCTTCOCOG TOCATGATTT ACTCTTAGAG 1«20 

ATGRTTCAGA QQTCACTTCA TTTTTATTAA ACASTGAACT TGTCTGOCTT TGGCACTCTC IGSO 

TGOCATACTG TGCAGGCTGC AGTOGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGCAAGGGG TGATGGCCGG CrGGTTGTHa GCACTGGCX5G TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGaCT GCOCOCATTG AtSKICTTCCr GCTGAGTCCT TTCCAGSOGC CftATTTTGGA I860 

TGAGCATQGA GCTGTCACTT CTGAGCTGCT GQATGACTTG A621TGAAAAA QGAGAGACAT 1920 

GGAAAOGGAG ACAGCCAGGT OOCACXTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG X980 

TOCCCAGCOT ACTTCACAAQ GGGATTTTGC TGATGOGTTC TTAGAGCCTT AGCAQCCCTG 2040 

GATGGTC3GCC AGAAATAAAG GGACCAQCCC TTCATGQGTG GTOACOrGGT AGTCACTTGT 2100 

AAOGGGAACA GAAA<::ATTTT TGrTCTTATG GGGTGAQAAT ATZUSACAGTG CCCTTGQTGC 2160 

GAQGGAAGCA ATTGAAAAGG AACTTGCSCCT GaGCACTCCT GGTGCAGGTC TCCACCIGCA 2220 

CATTGGGTGG GGCTCCTGOG AGGGAGACTC AGCCTTOCTC CTCATGCTCC CTGAOCCTGC 2280 

rCCTAJSCACC CTGGAGAGTG AATGCCCCTT GSTCCCTGGC AGGGCGCKM GTTTGGCACC 2340 

ATGTXaSGCCr CTTCaGGCCT GATAiGTCATT GGARATTQAQ QTCCATGGGG GAAATC3UVGa 2400 

ATQCPCAGTT TAAGGTACAC TGTTTCCArG TTAT6XTTCT ACACATTQM! GGTGGTGACC 2460 
C1X3AGTTCAA A6CCATCTT 

Seq ID NO: 588 Protein sequence
Protein Accession #: NP_005647.1

1 11 21 31 41 51

| | | | | |

KALN19QSPPA laPYVSNUOY QPEN^XPAQP TWPTVYBVH PAQVYPSPVP QYAPRVLTQA 60 

SNPWCTQPK SPSGTVCTSK TKKALCIIM' LaTFLVGAAL AAGMiWKFMG SKCSNSOIBC 120 

DSSOTCniPS UHCDGVSBCP GGBDENRCVH DYGStaFlUaH Y5SQRXSHHP VOQDDWNSNY 180 

GRAACEtDMGY KHNFITSSQGX VODSGSTSTH KUaTSAGRVD IYXKE.-SH9DA CSSKAWSLA 240 

CSiACQVHUHS SRQfiRlVGGB fiALPGAHPWQ VStUaVQaVEV CGG6ZITPEW TVTAAHCVEK 300 

PUanPHKHTA FAGIttHQSFM FYGAGYOVQK VI8HFNYDSK TKHKDIAI^ LQKPLTFMDL 360 

V£PVCLPl!lPG KKCiQPEQLCH ISGWGATEEK GKTSEVliflAA KVI>I«IBTQRC NSSYVyDMIiI 420 

TPAMrCAQFI. QGNVDSCQGD SGGPLVTSNN NXHWIiHGDTS HGSGCAKAYR PGVYGNVMVF 48Q 
TDHXXSlQHKA N6 

Seq ID NO: 589 DNA sequence
Nucleic Acid Accession #: NM_001935.1
Coding sequence: 1..2301

1 11 21 31 41 51

| | | | | |

ATGAAGACAC OGTGGAAGAT TCTTCTGOGA CTGCTQ6GTG CTGCTGCGCT TGTC3«^2ATC 60 

ATCAEZCSBIGC COGTGGrTCT GCTGfkACAAA G6CACAGATG ATOCXACAGC TGACAGTGGC 120 

AAAAGTTACA. CTCIAftCTGK. TTACTTAAAA AATACXTATA GACTGAAGTT ATACTCCTTA 180 

AGATOOATTT CAGATCATBA ATATCCCXRC AAACAAGAAA ATAATATCTT GGTATTCAAT 240 

GCTGAATAT6 GAAACAGCTC AGTrTTCTTQ GRQAACAGTA CATTTGATOA GTTTQQRCAT 300 

TCTATCAATG ATTATTCAAT ATCICCTGAT GGGCAGTTTA TTCTCTT2«3A ATACAACTAC 3«0 

GIGAAGCJUVr GGAGGCATTC CTACACAGCT TCAXATOACA ^mATGATTT AAATAAAAGG 420 

CAGCTGiKTTA. CAWUGAOAa OATTCCAAAC AACACAC9VGT GGGTCSUZATG GTCACCASTG 480 

GGTCRIAAAT TGQCATATGT TTGGAACAAT aA£ATTTA<re TTAAAATTGA ACXMATTTA 540 

CCAAOTTACA GAATCACATG GAOt3GQG3^ GRAC5ATATAA TATATAAItSG AATAACTGAC 600 

TGGGTTTATG AAGAGGAAGT CTTCRaTQCC TACTCOlGCTC TGTGGrTGOTC TCCAAACGGC 660 

ACTTTTTTAG CATATOCCCA AXTTAACGAC ACAQAATTCC CaCTS^ASTTGA AIACTCCTTC 720 

rACTCTGATG AGTCACTaCA GTACGCAAAG ACTGIAGGGG TXCCATATCC AAAGGGZUESGA 7 BO 

GCTGTGAATC CAACIGTAAA GTTCTTTGTT GTAAATACZAG ACrCTGTCAG CTCAGTC^U^C B40 

AATQCAACTT CCATACAAAT CACIGCZGCT GCTTCTATGT TGATAGOOGA TCACTACTTG 900 

TGTGATGTGA CATGGQCAAC ACAAGAAAGA ATTTCTTTGC AGTGGCTCAG GAGGATTC3U3 960 

AACTATTGGG TCATCaGAXAX ttaiGIGACIAT GATGAATCCA GTOGAAGAIG GAACTGCTTA X020 

GXGGGAOGGC AAChCKTTGA AAXGAlGTACr ACfGGCTGGO TTGQAAQATT 13K3QCCTTCA lOBO 

GAAQCTCAIT TTACOCXTQA TGGKMVTAGC TTCTACAAGA TCSVTOUSCAA TGAAGAAOGT 1140 

TACnOACACA TTTGCTATTT CCAAATAI3AT AAAAAAGllCr GCACATXTAT TAC3UUIAGQC 1200 

ACCrr(3GX32Uy3 TCKTOGGGAT AGAAGCTCTA ACCAGTOATT ATCTATACTA CRTTAGTAAT 12fi0 

GAATATAAnG GiVATGCCA!(3G AGGAAGGAAT CTTTATAAAA TCCAACTTAG TOACIATACA 1320 

AAAGTGACAT GCCIC3USTTG TGAGCTOAAT COS^UUUSGT CTCAGTftCrA TTCTGT(3TCA 13 BO 

TTCAGTAAAO AGGCQAAGTA TTAVCZUSCT6 A8KIGTTGO0 OTCCTQGrCT GCCCCTCTAT 1440 

ACTCTACACA GCSUBOGTGAA TQATAAAOSQ CItSAGAGTOC TGGAAGACAA TTUAGCTTT6 1500 

GATAAAATGC IGCAGAATGT CCAGAT60CC TCCAAAAAAC TOGACTXCAT TATTTTGAAT 15G0 

0AA3U:AAAAT TTTOGTATCA GATOATCTTG OCrOCIGATT TTGATAAATC CAAGAAATAT 1620 

OCXCZACTAT TAGAIGieTA TGCAOGCOCA 3GXAGTCAAA AAGCftGACIU: IGTCTTCAGA 16B0 

CTQAACTGGQ CCACTTACCT TGOUkGOVCZA GAAAACATTA TAGTACECmG CTTTQATGGC 1740 

AGAG6AA6TG GTTACCAAGG AGATAAQATC ATGCAUGCAA TCAACAGAAG ACTGGGAACA 1800 

TTTGAAGITG AAGATCAAAT TGAAJ3CAGCC AGACRATTTT CAAAAATQGG ATTTGTGeAC 1860 

AACAAACGAA tTQCSlATFTG GGGC3X36ICA TATGGAGGGT AOGTAAC3CTC AATGGTCCTG 1920 

GGATOGGGAA GTGGOSIGTT CAA0ICTGQA Alr&eCCGTGG OSGCTGTATC OCGOTOGGAS 1980 

rEACTATGACT CAGTGTACAC: AGAZUQGTTAC ATOGGTCTOC CAACTCCAGA AGACAACCTT 2040 

GACCATTACA GAAATTCAAC AiGTCATGAGC AGRGCTGAAA ATTTTAAACA AOTTOAGTAC 2100 

CTCCTTATTC ATQOAACRGC AGATGATAAC GITCACrrTC AGCAGTCRGC TCAXSATCTOC 2160 

AAAGCOCTGG TOGAIGTrGG AGIGGAXTTC CAGGCAATGX GGTATACTGA TGAAOAC^CAT 2220 

GGAATAGCIA GCAGCS^CAGC ACA0C3UICAT ATATATACOC AC3UrGAGC3CA CITCATAAAA 2280 
CiAATGnPTTCr CTTTAOCTTA G 

Seq ID NO: 590 Protein sequence
Protein Accession #: NP_001926.1

1 11 21 31 41 51

| | | | | |

HKTPnKIIiLa LXJ3AAAZ«VTI ITVPWIiUSK GTIIDATADSR XTmODYIiK UTYRUKZaYSL 60 

RWISDHEYI-y KQEt3Mll.VFBJ AEYGISrSSVFL BNSTFDEPQH 5TNDYSISFD <3QFlUiEyNY 120 

VKQWHHSYTA STDZmSSTKR QLITBERIPN NTQWVTWSPV GHKLftYVTWNN SZYVKIEFin» 180 

PSYRITHTGK EDTIYNGITD WVYEEEVFSA YSALWWSEKG TFLAYAQFHD TEWPIiTEYSF 240 

YSDESLQYPK TVRVPYPKAG AVNPTVKFFV VNTDSL8SVT NATSIQITAP ASMLIGDEYIi 300 

CDVTWATQER ISLQWUIRIQ WYSVMDICDY DBSSOHHNCX VARQHIEHST TtCWVORFRPS 360 

EPHETIaDGNS FYKIISMEBG YKHICYFQrD KiCDCTFlTKG THBVIGIEAL TSDYIjYYISS 420 

EYKGHPGGRN I^YKIQIiSDYT KVTCLSCBLtl PERCQYYSVS VSSEPiXI/YQIi RCSGPQLPLY 480 

TUISSVIIDKS IfRVLEXnilSiUi DKHLQNVOHP SKKLDFriUT BTKFHYiQMIIi PPHFDKSKKT 54 0 

PliLDVWASP C8QKADTVFR LNHATYIAST ENIIVASPDG RGSGYQGDKI MHAINRRLGT 600 

PEVEDQIEAA RQFSKMGFVD NKRIA1W6WS yGaYVTSMTL GSG8GVFKC3B lAVAPVBRWB 660 

YYDSVYTBRY MQLPTPEDNIj DHYEMSTVMS HAKNFKQVKY LLIHGTADDM VHFQQSAQrS 720 

IMVEVGVDP qamwytdedh giasstahqh jythmshfik qcfsi*p 

Seq ID NO: 591 DNA sequence
Nucleic Acid Accession #: NM_016077.1
Coding sequence: 128..667

1 11 21 31 41 51

| | | | | |

Ta3CTTTGTG ATTCTTGATG CXSGAACTTTQ TCACXXawSGA ACCCCGGAAQ AGGTAiGCTCA 60 

CGCGATAGAR AGGTGTTCGC TTGCCCRGAA GAAGGGAAGG C3QCGAGTGAG GAAAGGAGOT 120 

ACTGTAGA7Q CXXITCCAAAT OCTTQSTTAT OCSAATATTTG GCTCATCCCA 6TACACT06G 180 

CTlQGCIGTT GGASTTGCTT QTGGCAT6T6 CCTGGGCrGQ AQCCTTCGAG TATGCTTTGQ 240 

QATGCTCCCC AAAAGCAAlGA GSAGCAAQAC ACACACAGAT ACTGAAAtSTG AAGCAAGCAT 300 

CTTGGGAGAC AjGOQQGaasr ACAAGATGAT TCTTOTGOTT C3SAAATGAC7 TAAAQArrGOG 360 

A7UUU3GQAAA GTGGCIGCCC AGTGCTCTCA TGCTOCTGTT TCaGCCIAGA AGCAGATTCA 420 

AAGAAGAAAX DCTGAAATGC TCAAACAATG GQAATACTGT GGCCAGCCCfi. AGGTGGTGGT 480 

CAAASCTOCT GKTGAAGAAA CCXMATTGC ATTATTGOOC CATGCAAAAA TGCT^GQACT S40 

GACTOZAACZr TIAATTCAAG ATGCTGGACQ TACTCAGATT GCACCAfMCT CTCAAACTGT 600 

CCTAGGGATT GGGCCAGOAC CAGCAGACCT AATTGACAAA GTCACTGGTC ACCTAAAACT 660 

TTACTRCGTG GACTTTGATA TGACAACAAC COCTCCATCA CRftaTGTTTG AAiGOCTGTCA 720 

GATTCTAACA ACAAAAGCTG AATTTCTTCA COCAACTTAA ATGTTCTTGA GATOAAAATA 780 
AAAC3CTATTC CCATGnCTA AAAAAA 

Seq ID NO: 592 Protein sequence
Protein Accession #: NP_057161.1

1 11 21 31 41 51

| | | | | |

MPflKSI»VK5T lAHPSTUGLA VGVAOGMCLG WSUiVCFQMD PRSKTSKEHT DTBSEASILG 60 

DSGCYICMIIiV 'VSKDLKMGKG KVAAQCSHAA VGAYKQIQRR NPiHIiKiQKBY GGQPKWV3CA 120 
PDEBTIilAtJj AHAKMLQIiTV SI.1QDAGRTQ lAPGSQTVUG IGPGPADLXD iroTOHIaKia' 

Seq ID NO: 593 DNA sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..1896

1 11 21 31 41 51

| | | | | |

ATGOaOGCXXa TOCCGCTQCCJ OaCCCCXSCTC CTGCCGCMC TQCTGCTOGC GCTCCTGQCC 60 

GCTGCGGGOG 0CGGO6CCAG CAQAGCXSSAG l^COGTCTCCO WSOG/XGQCC GGAAGCXTQAa 120 

CQCaAGTCGC GGCCACGGCrC CGGCCC6GC30 CCCGQGAACA GC3«XX3GOTT TQQGTCTOQG IBO 

GOGGOGGGOG GCAGCGC3CA6 CTCCAfSCTCC AACAGCAGTG GQBAOGCCTT GGTOACCOBC 240 

ATTTCCRTCC TCCTCC3GCGA CCTACCCACC CrCAAfiGCAG CCGTGATCOT GGOGTTOGCC 300 

TTTACCAOCC TCCTCATCGC CTGGCTGCTC CXGOGOQTCT TCSiGGIGGGG AAACaAeOTTA 360 

AASAAGACAC 6CAAGTATGA TATCATCACC ACXGCAGCauS ABCSOACnDGQA AlCCGGGGCCA 420 

CTAAATGAA6 AGGATGATQA AGATGAOGAC TCCACAGTAT TOSACATCAA ATACaQAOTG 4B0 

TCCXTGC03G CTGCACTGAG ACGTCAPCTQ CCAGGGTGOC AGAOSCTACT GACAGTTCXZT 540 

GTGOCCCCAC CCTTCMCCT CGACATTGAC CTTCCaaCAA GATGCAGTGG AAOaCCrGAT 600 

GGTGGAATCA GACCTGGTAA AACClGTrrC OCftGOCTSCST GGCAXCXTIGT GGAAA6T3K30 660 

TCAQCTQCAA CCTGGGGTGT GAAGGACTOS ACCIOGAAGC OCTCXTGCQT OOBMSGIGIT 730 

GAAACCAAAA OGAACGTTAT GIATAAAAOC CCAGCTCC3«r CGXGGGTGTC AGGCATCTQC 760 

TCRGACTGTC ACTGGCAAGC TCGTTTCCAC GICAGCACAA TOTAOTTGCT TCTGOCAGCC 840 

TCTTGGGCATC CSCTTTAAAQT GCOCCCTACT TCTACTC3CCC ATGGTTTTOG ACAACTGCAG 900 

CIGAATCrCA TGGAAAAGCT GQATTCCTCT GOCTTAOQCA GAAACAOCOS G GCTCC ATCT 960 

GCXAGGTOCT TGCCACTCGt OCTGGCaGAA AieGGSGCIG CIGMUUSTGA GCTTtX3JAr 1020 

CCTTGGIGGC ACT'TCAiGCGC CACIUSGCTCT GCAATAAAAA COCITiaCAC 1080 

AQTACCTTGO QCTTGGATGi; TTTCTGTGOT GCX^GGCCAGC GGGGCACCTT TTOTGAASAC 1140 

AGAOCAGTGA C£AAGGTTCT CCAGGOTAGC VCTTTCTOCA AACAGCTGCG CTGGAAGCCA 13 OO 

GCGCTAGAGA GTGGGTTTCC CCATCATCIC AGGCTTCTCA QAGAGXGTCC TCOOCIGAOC 1260 

ACOCATCCT6 TCSVGGTTGGC TGC3TXCA6AT GCOOOGGGAC AAGCCAJ9C3CT GAGGGGGAGG 1320 

AGOGTGTTTC GGCOTCOGOG QCRCrtCCCTG CATGOOOGAfi GGTOUSOGGG TACXSSCAACT 1380 

TGCCITTTGG TTTTGAAGAT TCrGTTOfl<3G C3GCCATC3CTC ACCTTOACCr CXTCTACAAA 1440 

ATCTC3TCTCX; CCTGCTGTGC OGTGGAACAC CTAOGQQAAfi OCAAGAGAAG CTCAGTGACI 1500 

GrCCnXSCGl* CATTTGAGCA QAGCCCACAA AAGGCAGCTG CTGCCCAOGG GGAGOCTGTC 1560 

AAAOSAGGGC GCAjGTGGGCA A!XlGAOCftGA CACACATGGC CIGGCIGGGG QATCACACKT 1620 

GCXSAACCIGC AGACAA7TGC AGATACC3CAA GGCCAGGAAG QCGCAaST^ GGATGTCACr 16B0 

CACCCTGGAG GAGACTTGGA TOGGGTGGCA AATTTCTATT TOGAGGAAGA QOGTTTCCAG 1740 

GATQGCAGAT GCCAGAAlSAT GOTCCTGATG TCTOAGQAAQ GGCCACCTAG TTTGACAOGA 1800 

TOTOnQAGGC TCSU^AGGTTC CCATCACTTC TCCASCCATT CCAASTCTTG GTCCTTCCTT 1860 
TOCOaCX»AC AfiCCCCTGTT TClGTCCAGG OGCTQ& 

Seq ID NO: 594 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

HSJWELPftEIi liPLLUAl^LA ASAARASEAB SVSAFHPBPE RESRPPFQPG PGNTTRFGSQ 60 

MU3GSf3S3SS NSSGDAiVTE lSIIiLM>LPT LKAAVIVAFA FTTIiLIACLL LHVFRSGKaii 1.20 

KKTRKYtlllT TPAERVEMJiP UTBEDUKDED STVPDIinrRV SLPAALRHQli PaCQTLLTVP IBO 

VPPPPItDID LPARCSGRPD GGlfiPGlCrCF PAWWHPVESM SAATHGVKDH IWKPSCVGGV 240 

BTSCniVKYKT PAPSCVSGIC eDCHWQARFR VTTMSLLLPP FOIPFK7PPT STPH6FRQLQ 3 00 

XnUIEKLDSS ALIiRSITEtAPS ARCIiPIirVIAE MTVAAESDI^FH P^4WHFSATQS FIKTLYTQTM 3^0 

Sn^UDVFCG AGQRGTFCED RAVTKVLQGS SF&KOLRHKF AI..FSGPPHHL RLLSECPPIiS 420 

THPVRIiAR£D ARQQASLTGR RVFjEUlPRQSIi EGGGSA6TAT CI^X-VUCIXiIA RHPHLDLFYK 480 

ICLPCCAVEH LRKAKRfiSVT VIASFSQfiPQ KAAAAHQEFV KRJGPSGQLTR HTCPOWGITH 540 

AHLOTIPDTQ GQBSPREDVT HPGGDLDGVA HFYLEESGFO DGRCQKMVUf SEBGPPSLT6 6O0 
CERIiTGSHHF GSBSR&UGFIi SBtlQPltPLSR P 

Seq ID NO: 595 DNA sequence

Nucleic Acid Accession #: NM_021614.1

Coding sequence: 1..1740

1 11 21 31 41 51

| | | | | |

ATGAGCAGCT GCMSeTACAA CXSGGGGOSTC ATGCGGOCSC TCAGCAACXT GAGCCSCSTCC 60 

CGOG6GAACC TGCACQAQAT GGACrTCAQAG GOQCAGCCCC TGCAGGCCCC CGCX3TCTGTC 120 

GQAGQAGOTG GCX5GCQOGTC CTCCCCGTCT GCAGCGGCTG CGGCOGCCQC CGCTGTTTCG ISO 

TCCTCAGCCC COGASATCQT aOTGTCTAAa CCCGftGCACA. ACAACTQC&A CAACCTGGCG 240 

erCTATGQAA COGGCGGOGG AGGCAGCACT GGAGGAGGCG GOGGCGGTGG G0G6A30G8S 300 

CACGGCAGCA GCAGTOGCAjC CAAGTCCAiGC AAAAAlQAAAA AC3C3^SAA.CA.T CQQCTACAAa 360 

CTGGGCCAOC GQCGCHCCCT 6TTCGAAAAG OGCAAGCGGC TCAGCGACTA CBCGCTCATC 420 

TTCGGCATGT TCGGCATCGT GGTCATGGTC ATOGAfiAOGG AGCTGXCGTG GGGCJGCCTAC 4B0 

GACAAGGOGT CGCTGTATTC CTTAaCTCTG AAATaCCTTA TCAGTCTCTC CACOATCATC 540 

CTQCrOiQTC TGATCATOGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 60O 

GGAGCAGATG ACTOGAQAAT AGCCATQACT TAT<3RS0QTA TTTTCTTCAT CTGCT7GGAA 660 

ATACTGGTGT GTGCTATTCA TCCCATACCT GGGAATTArrA CanTCACAiTG GA£3GX50C5C3GO 720 

CTTCCCITCT CCTATQCCCC ATCCACRACC ACCQCTGA7G TGGAa'ATTAT TTTATCTATA 780 

CCMTGTTCT TAAGACTCTA TCTGATTGCC AfiAGTC3«rGC TTTTACATAiS CSUUiCTTOTC 040 

ACTGATGCCT C^CTAOAAG CATT03AGC!A CTTAATAAOA TAAACTTC3UV TACAOGTTTT 900 

flTTATGAAGA CTTTAATGAC TATATGCCCA GGAACTGTAC TCrTGGTTTT TA6IATCTCA 9G0 

TIA7GGAXAA TTGCCZGCATO GJUZTQTCCGA □CTTGTIQAAA GG^TAOCSVTQA TCAACAG6AT 1020 

OTTACTAGCA ACTTCCTTGG AGGGAT6TGG TTGATATCAA TAACTTTTCT CTCCATT6GT 10 BO 

TATGQT6ACA TCGGTACCTAA CACATACTGT GGAAAAOGAG TCTtSCTTACT TACTGGAATT 1140 

ATOOGTOCrG GTTGCACftiSC CCTGGTQGTA GCTGTAGTGG CAAGGAAGCT AGAACTTAiCx:: 12DD 

AAAGCAGAAA AACAOBTaCIA CAATTTCATO ATGOATACTC AGCTGACTAA AAGAGTAAAA 1360 

AATQC34QCTQ OCAATGTACT CAGGSAAACA TGGCrAATTT ACAAAAATAC AAAGCTAGTG 1320 

AAAAAGATAG ATCATGCAAA AQTAAOAAAA C31TCAAC3QAA AATTCCTGCA AGCTATTCAT 1380 

CAATTAAGAA GTGTAAAAAT e&AGCAGAGG AAACTGAATQ A£3C:AAQCAAA CACTTTGOTG 1440 

GAGTIX3GCAA AGACCCAGAA CATCATGTAT GATATGATTT CT6ACTTAAA OSAAAGGAGT 1500 

GAAOACTTOG AGAAGAGGAT TGTTACMXIXG GAAACAAJ\AC TAGAQACTTT GATTGGTAOC 15€0 

ATCCAGGCCC TCGCIQGGCT CATAASCCAS ACCATCAGGC AGCAGCAGAG AGATITCATT 1620 

GAGQCTCAGA TGGAGAGCTA CGACAAGCAC GTCACXTACA ATGCTGAGCG GTCCCGGTCC 16BO 
TOGTCCAGGA GGCeGC3QGTC CTCTTCCACA GCACCACCAA CTTCATCAQA GAGTAGCTAG 



Seq ID NO: 596 Protein sequence
Protein Accession #: NP_067627.1

1 11 21 31 41 51

| | | | | |

KSSCETTNOOV MRPI.SBn.SAS KBISIjHBMDSE AQPLQPPASV GQQGGASSPS AAAAAAAAVS 60 

8SAFBZW5K PBBM»£Ba3LA LYG^ITGGGGST CSGGGGmsas &BSSSSTKfiS EUOOaQNlGyK 120 

LGBSKAIiFBK RKRIiSDYALI VQKffQXVnW lETELSHGAY DKASIjYSIiAL KCLISIiSTlI 180 

LLOLZIVmA RBIQLPKVDI7 GADDffRIAMT YBtaPPICLB XLVCAIHPIP QSltVPTtfTRR 240 

liAFSYAFSTT TADVDZZLBI PMFUlLYIiIA RVHLLHSKZiF TDASSRSZGA XiNKIUFNTRF 300 

VMKTlM^rCP <3TVI.riVFSIS LWIIAAWTVR ACERyHDQQD VTSHFIlGA^^W LISITFLSIG 360 

ZIQOHVPOITYC GKGVCLLTGI MSiV9Cl!A2>W AWHRKEiELT KASKHVHNFH KDTQUrKRVK 420 

fiOAAiivui&r KUYBHTKiiv xKiEraucviuc msaa^irnxM oiksvkmbqr ia£nx2Aim»v 400 

tniAXTQiNZKr SMISDU9ER8 EDFEXRIVTI. BTKLETLIGfi IHAI.PGZ3I&Q TIRQC2QRDFI 540 
HAQHESYDKH VTYNAERGRS SStOtSBSSST APPTS5E3S 



Seq ID NO: 597 DNA sequence

Nucleic Acid Accession #: NM_016029.1

Coding sequence: 228..1097

1 11 21 31 41 51

| | | | | |

CIGCGAXCCC GCAGOGCAGC GAOGGGACTC TGSTGCGGGC C6TCTTCITC CCGOOGAGCT 60 

GGGOGTG06C OOGCGCAATG AACTGGGAGC TQCTGCTGTS GCTaCTGGTG CTGTQCQCGC 120 

TCCTCCXGCT CTTGaiGCAG CTGCTGCGCT TCCTGAGGGC TGAOSGOSAC CTGAOSCTAC 180 

TATGGGOCGA GTGGCAGGGA GOACaCOCAG AATGGGAGCT GACTGATAXa OTQOTGTGGG 240 

IGACTGOAGC CTOGAd&raQA ATTG6XGAGS AGCTGGCTTA CCAGTrGTCT AAACTAGGAG 300 

TTTCTCITGT GCTGTCAGCC ASAAiOAaiaC ATGAGClGGA AAtSGSTGAAA AGAAGATOCC 360 

TAGAGAATG6 CAATTTAAAA 6AAAAASATA TACrTGTTTT GCCCCTTGAC CT6AC0GACA 420 

CTGGTTCCCA TGAAGCQGCT ACXSRAftGCTG TTCTCCAGGA GTTTGSTAGA ATCGACATTC 4Q0 

TGGTCAACAA TGGTGGAATQ TOCCAGOGTT CTCTaTGCAT 6GATACCAGC TTGGATGTCT 540 

ACAOAAAGCT AATAGAGCTT AACEACTTAG GGSAOGGTGTC CTIGACAAAA TGTGITCTGC 600 

CTCACATGAT C6AGA1GGAA1& CAAGGAAAGBk TTGrCACrGT GAATAQCATC CTGGGCATCA 660 

TATCT6IACC TCTTTC3CATT GGATACTGTS CTAGCAAGCA Tl^CTCTCCGG GGTrTTTTTA 720 
ATGGCCTTOe AACAGAACTT GCCaCATACC CAGQTATAAT AGTTTCTAAC ATTTGCCCAG 780 

OAOCTGXGCA atcaaatatt gtggagaatt ccctagctgq rqaagtcaca argactatag a40 

GCAATAATGG AaA<3CAGTCC CACAAGATOA CAACCAGTCG TTGTGTGCGG CTGATtSTTAA 900 

TCAGCATGGC CAAT6ATTT6 AAAGAAGTTT OQATCTCAaA ACAAECTTTC TTGTTAGTAA 960 

CATATTTGTG GChATAC&TG GCAACCTOaa GCTGeTGGAT AAOCftACAAG AraGGGAASA 1020 

AAAGGATTGA GAACTTTAAG AGTG6TGT0G ATGCAGRCTC TTCTTATTTT AAAATCTTTA 10 BO 

AGACAAAACA TGACTGAAAA GAGCAOCTGT ACTtTTTCAAS CCACTGGRGG QAQAAATGGA 1140 

AARCA.TGAAA ACWSCAATCT TCTTATGCTT CTQAATAATC AAAGACTAAT TTGTGATTTT 12 OO 

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAAT6A AATAAAAAAT AAATAATAAA 1260 
AGATIGCCAT 6AATCTTOCA AA 

Seq ID NO: 598 Protein sequence
Protein Accession #: NP_057213.1

1 11 21 31 41 51

| | | | | |

KtmShLhmJi VIiCALI»M.I.V QLLRFLRADG DLTLLWAEWQ GRHPEWELTD MWWVTGASS fiO 

GZGE&LAYQIj SKLGVSLVLS AKRVBELmiV KSRCLEKGITI. KEKDXLVLPL DLTZ>TaSHBA 120 

ATKA.VLQEF6 HIDIIiViDJGG MSQRBLCMDT SLDVYRKLIE lBrVI.GTVSLT KCVliPHMIEE. 180 

SQQKJVTWS ZLGIISVPLS IQYCASKHAIj RGPEISK3LRTE LATYIGIIVS NICPGFVQSW 240 

IVSnSLAGBV TKTIBHNGDQ 8HKMXTSRCV RLMLISMAND XJCEVNISBQF FLLVTYLWQY 300 
MPTHAHWITN KMGKKRIB»F KSGVDAD3OT PKIFKTJCHD 

Seq ID NO: 599 DNA sequence

Nucleic Acid Accession #: NM_000793.2

Coding sequence: 401..1222

1 11 21 31 41 51

| | | | | |

GQCTGCAGA6 ASAaaCACTT TGCACCACAG ACAGATAGCA AI3AABGGRAA GACAGAGAGT 60 
GAGAAAAAAG ACGAGTCAGI OSCrCCTGGG GAAQGGA6AG AQTGAGACTG GQAGAAAGAG 120 
AAGCACAGAA AGTGTOTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCXTAAA IBO 
GCACA.TTTAA AAAAAAAAAA CTCTGQCAAT TCAAGAAAGA AACAGGCTAG GTTTAAAGAG 240 
CATAGAGftCA ATQAAAGGCT AAAGAAAATT TTAAAATCTC TOCCACROTC TCATAGGTGC 300 
TTGGAAATGA AAGTAGAACT GCCTGTCTTT AA06GACTCT GACAGAGAGO GTGAAGGG6A 360 
ACCAGAGOSC ACAA13GGAAC TGACTCAGGA GGCAGAGAAG ATGGGCATOC TCAGGGTAQA 420 
CTTGCTGATC ACACTGCAAA TTCT0CXZAOT TTTTTTCTCC AACTOOCTCT TCCTGGCTCT 480 
CTATGACTCG GTCATTCTGC TC7^A£3CAOGT GGVGCTGCTO TTQA(30CBCT CCAAGTCXllAC 540 
TCQOGOAOAG TCGOGGOGCA a»3CTGACqrrc AGAGGGACTG CGCXGCGt£Cr GQAAGAGCTT 600 
CCTCCIGGAT GGCZA^3UIC aOGTGAAATT OGGTGA6GAT GCCXTCCAATT OCJVSTGrGGT €GQ 
GCATGTCTCC A6TACAGAAG CSAGOTGACAA CAGTGGCAAT GGTACGCAGG AQAAGATAGC 720 
TQAGGGAGCC ACATOQCACC TTCTTGACTT TGCCftGCCCT GAGC3SCOCAC TAGTGSTCaUk 780 
CTTTGOCTCA GCCACTTGAC OTCCTTTCAC GAGCCAGCTG CCAGCCTTtX: GCAAACTGGT 840 
OGAAGA6ITC TCCTCAGTGG CCGHCTTCCT dCTOaTCTAC ATTGATGAGG CTCATCCATC 500 
AQATOaCTGG GOGATAQCQQ GSOACrCCTC TTTGTCTTTtT GAGGTOAAGA AGCAOCSUiAA 9£0 

CCAGGAAGAa* CSATC7XGCA6 GAGOCCftGCA QcrrCTGOAG COTTTCTOCT TGCDCGCCOCa X03O 

GTQCCQAOTT GTQQCTGAOC GCATGGACAA TAAOGCCAAC AlAGCTTAOG GGGTAGCCTT lOflO 

TGAAOGTGTG TGCATTGTGC AGA6ACAGAA AATTQCTTAT CTGOGAGGAA AGGGOCCCXT 1140 

CIQCTACAAC CITCAAGAAG TCOBOCATTQ GCTGGAGAAG AAXTTCAGCA AGAGATGAAA 1200 

BAAAACTAGA TSaiGCTGGTT AAA9STATGA TTATAAGAQA QCTTATT6IT TTAAAAAGTT 1260 

ATAXAAAGGC AAGGAAATTA AC3AACTGAAT OCATATTTCA ACAGAGCGCT ATTGGCTTAC 1320 

^TGAAAGACAG GAGT^TATCT ATGGGAAGAA CATGAATCIC TAACAGCTOC ATACTTCm 1380 

CACXACTCAA ATGGCATTGG GCTGAGTAAQ TAACCATATC AOCTCTCTTC TTAGTAAAAA 1440 

GCCCI3VTGTO AAAAGATCCC AAOATGQAGA G6AAGAAAC6 CTAATTCAGC ATGTGTTCAT ISDO 

TCTOCATICA GAAaOAACTG ATACAITCZGA TOCAaWJTTi' GAGACCAGAA GAAA2\GACTT 1560 

ACCTQAATAA TTACIACATT AGGGAAGCTA CTGTC2TACOT TAAQATAAAG GGTATTGCCT l€20 

TGGcrcraarr tggcatggat GGAGCCCAGr toqaaaattc ccaaatatta caacs^agtcc iceo 

TrQRA£X3CAG GCCATGITOGT TAOACJaTTGG TGTTAAGGTT AOAiCCTTATG TTAGAGTCAT 1740 

TTCXGATGTT CiCAGCTTCTA QCCKEGTMSC GCETCXCAGTC TTCATACXXC A6AAATXATT 1000 

GSTWnLTTTa TAGATACCGA GfUUTQAXCCC TCAGTCIGAG AGGTT2\GAAT GATCATCTGT 1&60 

iWTCIGAGGG TTAATTTCTA G6CA6GTGGA GAGAGTGGTA AAAAAQAAAT GAAATTGACA 192D 

AGCTAGGAAA QABGAGGCAG AAAGAXTTGG AAAATTCT^ GAGTTTCACC CTTAAOCrGT 1980 

AGAGASrGGG TCAOVrrrGT TAGCCAOSGA AACATAGAAA CAZACAC3UiG GCCAGAAAAA 2040 

OAAGAAGGAG CrC3UkCTAAA AGTGGCATAG A6AATACACA TATAAAAACA ATATATTTGX 2100 

CHXAT13CTCC TAGaGAiSQRia AAAGC3GGTI3A TTGAAAGAAA AAAAAATACT TAAATATTTG 2160 

TAA7TGTGA6 OGGTTTCTTT TGGAAATAAT ^ACTTTTGAA CCATGTATGT GGZAIGTATA 2220 

TriT<5flGTOG GTTAATTATA OCOTkTQATA CCTATTAAAG GAAAAlXaOT GGGTCTGGTG 2280 

STGCIGGTCT TTTCCTOCCC ATTCCTACAA TTTCTATGT6 GCOC3iAfil?CA TTCCTAATCT 2340 

TQaa?CTCXAT aSCAGTOTTC TCTCTGAATG CXGftGCriGAA OAAATTATAC GTACRTACftC 2400 

ACAXhCATAC ATACATACAA ATASAIGSAT ATATATTCTC ASC£QCSGCG GGAGGTAOGT 2460 

ACXATGGOCA TTCAGCACAG CCITGATTTC CTGCCAAAGT AGG7GACK7EA TAGTGAAGAA 2520 

TAGGTGCAAA CAAACAAGCT TACTTCCATT OCAAAATAGA AGAAGAGGAA GrTAGAGATA 2580 

ATTCTGATCA ATCATTTTt3G AaSCTTTGTT ATAAGGCAAC COCCC3GTATA TCATGGAATT 2640 

TCCATT6ACA TTTGAATTTG GACTTGGATC 7TCCCTTGGT COCATTAGCT GAGGTTTAGT 2700 

ATCTAAAG T CGCCATAG^ TA1GATTATA ATGCTAamrT AAAAAATATA TATKTAAAAT 2760 

AmTTTTCT TTTTAAAATA GAjCACTATAG TTTTACCCAT AAGTAAVATT TAAAGATTAT 2820 

AGCTCX:CAAA AGAATGGACC AACCACTTTC GTATGATAAT TTCTTTTTGG TAAATATGRG 2880 

ACTATTATGA AATCATAGTA TATGATTQTA TTTAAAGGTA CAATCAAAGG AT ei ' i ' riUT C 2940 

CATTGCATTA ATAACIGAAT AAAAAATAAA TAAAATCGAT AGAAAAAAAC TAAAGTTGAA 3000 

AATACATTCT TAAACIAOTT 6TCEGAAATG AGAAAAfiAGT GAGAACTftGG X6TGCAAGAA 3060 

CC2AAACQTAT TITATTTTAT TTTTTAAATG GQAGCAACAT ATCABTOSTG TCACCAGCTG 3120 

GTATATTGTG TAAATATTAA AGCTCCATXG GGAGIGATTT TTCATGGCAA CATCASCITT 3180 

CISUWTQTTCT AAATTCXATA AAAACCACC3C AOUUUBAAAC AAAGCAAATT TOITZAXCTA 3240 

A.TG»GTT6Cr GGAAAATCAT ATTGAGAATA ATTATTTCAG 
TACATTCAAG GGCTTATCTC TOOOCCCXST GATTTTTAAC 
&CTGTGGAAC CCTAAAGCAG TAAAATAAAA AACCTQGTTG 
CCTTAAAATT CCXXTTTTTTT CTCTATGTAC 6ATAAAGTAA 
TGGGGGGATG ASATXAGGCT GAGGCAGTGC TAGTCAACTG 
AATCACGCAO TTGTGCTATA TTTTTAAAGA AGGAOGTGGT 
CCCTGA5GTT AGCCCAATGS AGAAATGAAS CAOAOGAAaa 
TATCAGC3t3AC3 GAAGATC3TTC AATAGAACAT GG^GAATTT 
GGGCCAA1GG AGAAAATOAA TGGACiVAAGC TCAGGAATCC 
TTGGTGTTAT CAGGGTTAAG CCCTGTAATT ATGTAACCTA 
TATGATTlCr TGTQATOTAT TCT^FTTAIGA AATTAAGAftS 
AGGAATMTCA A T GC TT T ATC TGATATGCTQ AGAAATTATT 
CGTTTCATGT OTTTTATAAO GTTTGTTOCT TTGAAGAATT 
GAAATGTGTA TCTATTIATA TATCATAGTA TAAATCTATQ 
AAAGTCTGAQ TTCTCTTTCT TAGTCCCTAA TC&TGTTTCT 
GGAGCXATCG 6TTTAGGCTT TTAAGCTTCA TTA9CTTGTC 
A6AAATITIA QATATTATCA TAACATCXOG GXCTACTCAA 
TTATGTCTTG GAOCTATCAA AAACTGACTT TATTTATTGC 
ATCAACAATG ATTTTCTTOA ATQGGCATGA ATGGAGATGC 
GTTTCATACA GCTATTAAAA TGTAACTGAC CTCXrTTAQAG 
ACTTTOTATA GCTAAUTCjAC AGTCSICrTAA CITACATGAC 
CrCTGGTCCT GTGTCTTCAC CTCATTTATA GC3U3aTCTGC 
CTTCCC!AGTQ ATCTGTTCRQ TTAAGTTCTT CTCCOGTTAA 
CATCRCASTG GGAAGAATAG CCTATTOTCT TTCATTTTGe 
GGGCCCTGAA ATAAAAATTA TGAAATATOG TGAGCTCACA 
ATAAAA'l'i'Ci' AQGAOGGCA6 OTTAGGAQAC AfiTTATOXAT 
AGGGTGGGAT TACAAGCSGTG TTCCTCAGGC AT6QCCCTAT 
AAOAATTGAC TQATTTACAS GACTICTCTir TA17GTCAATC 
GGACATTTGT TCCACOOGAC CTCTGACTOA TG0rTTG(?JiA 
TATQAjCCATT GAAAAAGQAA AAATGXAGAC TCTGACTTCC 
AAAACCT7TA CTA6CKTTTA GA0CT3TTCA 6AACATOCCC 
iGBfifiACTGC AAGTAAGGCT ^TAATTTTA GGA6GTTTTT 
TAAATOGTAT GGCCAAAAGT CAGAGTTAAA ATATATATAO 
TCACTCTAAA AATATSAATCXr AAACCCACTC TTCATATATG 
TAGCAATCrC TGCTTTGCAA TGGGCACAAT CTTGGTCATG 
AGAOAfSSATC TASSATOBGA OAGCTAGAAA GTTSCTA3U:T 
GOGTTOGTCT ACCAATCTGG GA»BATTTGA AAAC3LAACTT 
GAAfSGCTGCT GCAAGTCATT GAGT6ACTTT AOGATGA6CA 
ATGCOCTATG TGTATAGTAC CAGAAGCAA9 GTCTCASACT 
CAAfigrgAGT CTGAACCa\AT A6AAAGCAAA CATGTGCAGA 
TGCAAGTOG6 OGCTGGCTAC CGGTC^ACT C^AQCakACASC 
CAATATTTAC TGAGACrTTCfi AAC3AIXX!M3C 2U3A3GTTTAA 
AAAOCCTCCA CTTCTCCCCC TOCCCTCAAA AAGCSCAACAQ 
AACC»:3VCAaA JUSaQOATGaa aaataaagaa aattctcica 
CACIGGTCAG OGTGGTTTIT ATGTGTATEA GGATTGGGGG 
AOTACTTTAT AACCAAAQCA ATTAAATSAT ATXeGGGSAG 
TTA6TTTTGC CATCACATTG TCACCCAiGAC CTCACCTAGC 
fiAAGAaOGAO AauaaaATGT GCO^iQAGTre ACCCASTGTG 
AAGA6TCATC GACCTCAGTT AGTOGTrGGA TOTACJTC3U2A 
TXTGECICCC TGGCAAGQAS AATATGGGGG ACAIGATGCT 
GTOAGAATGC AOQOGTGCftT ATGCXACACA TATGTOCITC 
GCTTTGGGAG ATTATCAGTA GAAAGAGIGT TATGATAT3?6 
TTATACAAm TGTTCTTGXA TTTTAATAAA CTTTQAATAA 
AAAAAAAAA 

Seq ID NO: 600 Protein sequence
Protein Accession #: NP_000794.2



ATXCCTCAQT 
CTCAAAATGG 
CAGCACATTC 
CAQTTiTGTCA 
60GGAAAA06 
TTATGTGTGC 
AAA(2ATAGAA 
CTGGAAGAAA 
CTAOQCTATG 
TTTATCGCAA 
AACTCATTAT 
AGATTGCCAA 
GTAGTTCTTA 
ATATATTTAT 
COCATAGOCT 
TATTATTGAA 
ACACTTATT3 
TTAGTGAAAA 
CDGCRCAOTA 
GCSUaATTAGT 
TTTCTTTTTT 
TTGATTTTTS 
CCAGGAAOTG 
CTGAGTGTAT 

GGGCTTTGGG 

oaacccTATG 

TTAAGAGGAT 
MTAACTTTA 
GTCCCACX^ 
ACTSrCATGr 

TTAQATTCCA 
CTTCX3VGAAT 
TCCT6AGGCT 
GGGAAOAACA 
CTGGCAACTG 
AAM:A7TGGG 
TAACAGACCG 
TATCX»AAa\. 
A<3AGCICCAG 
TGAAGTCACT 
GTAAACAiCAT 
AGACTTCTCC 
ATGTQAABAA 
GGAAXGTTGG 
COCAAGTAAT 
GGGATGATAA 
TTAGmGCJC 
AAGAGCOCTG 
TC3U3TTGC2U3 
GTGCTGAGTG 
AAOAAXAAAA 



TGTTAACTTC 
TGTGAGATTT 
ACACTGTTGT 
GATAAGCOGQ 
ATGATGOAAA 
AGACAATTCT 
AGACATGGGC 
GiaCTQTQGAA 
TAG?ATGTrC 
CATGAATTTT 
TTTQAGGTAQ 
TACrCATGTG 
GTCCCACAGO 
ATCATATATA 
GTGTTTACAT 
ATAGTTICCA 
TTTQAAAGAC 
TACTAGTGGG 
ATGTAQAAAT 
AACTGTTCCr 
CACATTGGGT 
GTAGTATCAA 
cttattctct 
TTTACTATTT 
GCCTTGCrGC 
GAAAATTCAA 
TGGAAGCAAG 
GGATGAATCT 
ATTAGGATCA 
AGQATTAArra 
GTCICAGCAG 
TTTTTTC5CCC 
ACTTCCTCCT 
QGOaCTTAAO 
CTCTAAGAAA 
AQQCOCTGAS 
AAGGAAGGCT 
CCACTTCCTA 
AGCTCTGTTC 
AGACTQCTCA 
GG2USCTTATT 
ATTTTQGCTC 
AAATGAAAGA 
A©aCXX!ATGT 
ATAAGTATOC 
OC3U3TTTTGT 
CGGGCGCCCC 
CraCTGR0(3A 
TCTCOCXaVTC 
OGTAAATGTG 
AAAATGAAGT 
CTA'XXjrUTQC 



51 



1 
I 

OCATSGATGQ 
ATGGAGATAT 
CCTCCTCCTT 
CAATGAAGTG 
TCCAXCACAT 
TTAOCAGGTG 
CCCCAGGAAC 
TAGCATTCCA 
TGATGATGAT 
AGCTOATGAA 
GATTAGIUSaA 



ix 

I 

TAACTTCrCC 
GCTCCTCrCA 
CTCAjSCTGCT 
AATCTACIGG 
GOOrGGGAAG 
TGCAATGTCR 
TCAGCTCAGA 
TTGOTTTXAQ 
CKTGCX3QTGA 
AGTTTOVCTC 
GTAGGTCCTG 
GCCTTGGTGT 




CTCSTTCXCGA 
ATTCAAAAAC 
AGATCAGTGG 
TG6ACCACAB 
ASATTTATGT 
GAACTTGCAA 
AATTTC8AGA 



TGTGGKTGAA 
TCAAAACAAT 
GGAGCTCAAG 
GGA^ACATTC 
OCATCASTTT 



TCAACAAGAA 
CCGTGZUSAGT 



GGGATTTTnr 
ATACTTCAAA 



41 
I 

TCACATCAGT 
ATGGATTGTC 
GAACTGATTC 
GHGCrGGGCT 
CATTACACAC 
XGGCtGAGAA 
TTCACTCTAC 
AACCTGTACT 
AOUUkGATXe 
AITCTGAAGC 
TTGGCATTTC 
AAQTQCOCAT 



3300
3360
3420
3480
3540
3600
3660
3720
3780
3840
3900
3960
4020
4080
4140
4200
4260
4320
4380
4440
4500
4560
4620
4680
4740
4800
4860
4920
4980
5040
5100
5160
5220
5280
5340
5400
5460
5520
5580
5640
5700
5760
5820
5880
5940
6000
6060
6120
6180
6240
6300
6360
6420



1 11 21 31 41 51

| | | | | |

MGII.BVDLLX TUOfLl^eVFEB NCLFLALYDS VIItLKHWU. IiSRSXSTiUaE HOtRmJCSBBL 
RCVHKSFUS ATKQfVKLf^ ASHSSWHVg STEGGDHStar GTOEKIAEGA TCHLLDFASP 
ERPLWIOFGS ATUPl>PrSQL PAFRfCLVEBF BSVAnFCJJVr XSBAHESDGH KLPGDSSUSF 
BVKKH0NC2ZD RCAAAjQQ|I>Ii5 RF&IiPPQCeV VKDSMOmOX lAYGVAFERy CZVOSOKIAY 
LGGHGPFSYN LQEVIUIKLEK 19FSKRUKKIR lAG 

Seq ID NO: 601 DNA sequence

Nucleic Acid Accession #: NM_005233.1

Coding sequence: 101..3052



60 
120 

180

240 



51 

I 

GGCATOCITC 
AGCTCTCCaT 
CGCACiCCTTC 
GGATCTCTTA 
GCATCaaOAC 
CAAACTGGGT 
GAQACTGCAA 
ACATG6AGTC 
ACACCATIGC 

AAGAIGITGG 
TTACAGXGAA 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 






PCT/US02/36810 






GAATCTQOCT 
AGGGTCTTGT 
AGGCGAATGO 
TTTTATGTGC 
TGCTAAQTQC 
GAATAATTAC 

GCOCCTGOAC 
GTGGAATATA 
TGGACTCAOC 
TGAGATTGAT 
GOTCAGCATC 
GACCTCCAGA 
ATTGGACTAC 
GAGGGCAAGA 
CCAAATCCXSA 
AACTASTOCA 
TTCAGOGOCA 
CTGTGGCTAT 
TTTAAAACTT 
AGCTGTXCAX 
TGCMGCAQQr 
GATTTCA6TG 
CCTGGGAQAA 
AGTT6ITACC 



TCGAGGGATA 
CX3CTGCTC3GG 
TTCJOCGTOTC 
AATCM9GTGG 
ASKSQAOTTAT 
GATGTCCAAT 
GOACTOCCCA 
CAGACOCAAG 
CCTOAAOATC 
TGTGGATATC 
CTQCUOGAA 
CACAGATGAC 
rAOCATTAAA 
G6AAGTGCXT 



ATGTTTCCAG 
GTCRACAATT 

CTTGTACCCA 

CCGCCTCACft 
TTCOSOQCftG 
AATGTTATCr 
ACAGGAGGCC 
AAACASrGTG 
AACAC3CAOGG 
GCCGTTAATQ 
ACAACTAATC 
AATAGCATCT 
GAGGTCAAAT 
G6CACAAATG 
GCCC3GAACAG 

gactctttct 

GTAGCAATTA 
AAGTCAAAAC 
CCAGGTCTCA 
GAGTTTGCCA 
GAATTTGGAG 
GCCATTAATVA 
GCAAGCATTA 
AAAAGTAASC 
CTAGOTAAAC 
GCATCTGGGA 
AACATCrrGA 
CTGOAGOAlta 
AC2VXCACCAG 
GGt^TTSTTC 
CAGGATGTAA 
GCTGCCTTGT 
TTTGAGCAGA 
ATCAOCAGTQ 
TCTAOCTTCC 
ATCTTCACGQ 
ATGAT^AAAGG 
GCTCTAGAAA 
CTGGAC66AA 
GOAOATACIG 



ACACGGTACC 
C3fAAGSAGGA 
TTGGCAAGTG 
GACCRGGTTT 
GTTCTACTCA 
ACRAAtaACCC 
CTAATATAAA 
GGAAAGAT6T 
AfiCCATGCAG 
TGAIZAGTGAC 
GGGTGTCAGA 
AGGCTGCTCC 
CTTTGTOCTG 
ACTATIGAAAA 
TTACXMC2U3 
CCGCTGGATA 
CCATCTCTGG 
TTCTCCTCAC 
ATGGGGCAGA 
GGACTTATGT 
AGGAATTGGA 
AGGTGTCCAG 
CCCTGAAAGT 
TGGSACAGTT 
CA5TTAT6AT 
AOGSATGCCCA 
TGAAGTACCT 
TCAACA6TAA 
ACCCJU3AAGC 
AAGCTATAGC 
TCTOGGASOT 
TTAAAGCTGT 
ATCAGCTQAT 
TTGTTA6TAT 
CASCOGCAAG 
GCACAAGAG6 
GCGTQGASTA 
TTGGTGTCAC 
CGCAATCAAA 
GTGGT6GCTG 



CATCSOACTCC 
AGATOCTCCA 
TTCCTQCAAT 
CTACAAGGCA 
GGAnSAXBGT 
TCCATCCAaCG 
OaASACCTCA 
TACXrrTCARC 
CCCAAATGTC 
A3ACCTTCTG 
GCTGAGCTGC 
ATCACCTGTC 
GCAAGAACCT 
QCAGOAACAA 
TAGCCTCAAG 



TGAAA6TASC 

TGAAAAAAGA 
TGACCCACAT 
IGCCACCAAC 



TGGCTACACA 
tTGACCACCCC 
TGTCACAGAA 
GTTHACtGTC 
OTCAOACATG 
CTTGGTGTGT 
TOCTTATACA 
CTACOGC2VAG 
GATGTCTTAT 
AGATGAGGGC 

□cTGOAcrac 

TCTGGACAAG 

TGftCTGGCTT 
CAGTTCTTGT 



QAATGGCCCA 
TGGAAGGOST 



CABTOOCTGO 
AGGATGTAGT 
GCTGGCTATG 
TTGGATGGTA 
TCAATGAACT 
GCTTGTACCC 
GTTATOCTGG 
ATCATATGTA 
CSSCTTOCTCC 
GCAQCTACIA 
CCACC7WAC 
CTGACGATTA 
GAACRTCCTA 
SAAACAAGrr 
OCTGACACTA 
AG00SCAA8T 
CAAGTGGTCA 
^CATGTTTTGA 
CTTCATTi'i'G 
AOWTATGAAG 
AIATOCRTTG 
AftACTTOCTT 
GAAAAGCAGA 
AfilATCATTC 
TACATGGAGA 
ATTCnGCTAG 
GOCTATGTTC 
AAGGTTTCTG 
ACAAGAGGBG 
TTCAOGTCMJ 
GGAGAQAGI^ 
TATCt^UZTOC 
TGGCAGAAAG 
CTTATCOGGA 
CriGTTCrOG 
AATGS3GTCC 
GACACAATAG 
CCRCftGAAGA 
GTTOCCC3T6T 
CAAGTCATCX? 



TGGAGGTTAG 
GCASTACAGA 
AAOAAAGAGG 
ATATGAAGTG 
GCAGGTGXGA 
GACCTCXATC 
ACTGGAGITG 
AAAAATGTGG 
CTCGACAGTT 
ACTACACCrr 
ABTTTOCrSC 
AGAAAQATOG 
ATGGGATCAT 
ATAOCATTCT 
TATAOGTATT 
TTGAGTTTGA 
TGATOGOCAT 
XTGGGAGGrr 
GCau^TGGGCA 
ACCCTACCCA 
ATAAAGTTGT 
CAAAAAAAGA 
GGAGAGACTT 
GACTGGAAGG 
ATC^TTOCTT 
T6GGGRIGCT 
ACCSOAGACCT 
ATTTCGGACT 
OGAAGAO^CCC 
CCA0CGATGT 
CAXACTGGGA 
CAGCCCCCAT 
ACAGGAACAA 
ATCC5CGGCAG 
ACCAAAGCAA 
GGACAOCACA 
OCAAGATTXC 
AGATCATCAG 
AAAGCAGGAC 
TGCAGACAGA 



Seq ID NO: 602 Protein sequence
Protein Accession #: NP_005224.1



1 
1 

HD0QL8ILLL 

HYTPIRTYQV 
mSXYKESDDD 
laAFQPVGACV 



PPRQFAAVSI 
BTSXnUlAR 
OWHIAXSAA 
TYKDPTOAVH 
SKQHJU^iFIiGS 
ZQliVI94UtGl 
TSGGKIPUW 



11 
I 

LSC&VI1D6F6 
CWVHDHSQWN 
HGVKFRBHQF 
AIiVSVRVYTK 
LVPIGKCSCN 
FRADKDPPSH 
ICQCEPCSPNV 
tTESlQAAPSFV 
OmVTISSIilC 
-VAIZLLTWl 
FFAKBLDA1N 
ASIMGQFDHP 



21 
I 

TKISTIAADS 
IDCPFTViaSIiA 



31 
I 

NLUD8KTIQS 
3AQKIYVBLK 
SFTQMDLGDR 



FQKKZ2GS1K 



TSPEAIAYSK 
AAI>VQLKU3C 
STFRTTGDHL 
ALfiXQGKNGP 



ACTRPPSdFR 
RFLPIbQFGItT 

i/riKKDRrrSR 

POTiyVFQJR 

Z8IIIKWGAG 
KIIAIiSGWT 
GYVHRDIiAAA 
FTfEABSVMSY 



QACEtPGFTKA 

nvz&HXDsrrs 

NTTVTVTDIili 
N&ISLSKQfiP 



41 

I 

SLGWISYPSH 

QfiLVEVBGGC 
LUGNKKCmCC 
VIIJJHGKPm 
AHTWYTFEID 
SHPHGIXLDY 



HGVRTnBCXB 



KSKUGADEKR LBFCailGHUCL 
EPGSVCSGRXj KZ(FSKKBZ?V 
K8KPVMIVTE YMENGSLD8F 

NzitiNsmivc scvraoFGEiSsv 

<SVLHEVMSY GSRFYnSHSH 
FBQIVGIUIK LlRMPS filiKI 
IFTGVEVSSC STZAKISTIID 



Seq ID NO: 603 DNA sequence

Nucleic Acid Accession #: NM_005727.1

Coding sequence: 122..847



1 
1 

GCCABOGGTC 
TGGAGCCTCA 
CATGCAOTOC 
GTGTG6TGCA 
GAAGATCTTC 
CATOSCAGCC 
rcGAiGAGCAAG 
QGTTGCAGCT 
GC1X3GTAGT6 
GAACAGCAGC 
CTCftCCCTAC 
CAACACAGCC 
CTTCAATCAG 



11 
I 

OCrCTQCCTG 
GCAGTTCCCr 
TTCAIBCTTCA 
6CCCTGTTGG 
GOaCCACTOT 
G60GTTGTGG 
TGTGCCCTCQ 
GCTGTGGTOG 
CCTGCCATCA 
ATGAAAGGQC 
TTCAAAGAGI^ 
AAaGAAACCT 
CrTTXGTATO 



21 
I 

OOCACTCAGT 
CTTTCAGaAC 
TTAAGACCAT 
CAGXGOGCAT 
COTCCASraC 

TQACQTTIJli'r 
OCITGGTGTA 
AGAAAGATXA 
TCAACSTGCTG 
ACnSTGCX^XT 
GC31IDC3UUSCA 
ACVTCOSAAC 



31 
I 

GGCSIACSWCCC 
TCACTGCCAA 
GATGATCCTC 
CTGGGTGTCA 
CATGCS^GTTT 
TGOTTTCCTO 
CTTCATCCTC 
CAOCACAATO 

togttoocag 
tggcttcaoc 

TOCCCCATTC 
AAA9GCTC2AC 
YAATGCA3TC 



41 

I 

OGGASCIGTT 
GAGCCCTGAA 
TTCAATTTGC 
ATOGATGGGG 
6TCAACGT6G 
GGCTQCSATG 

crocTCATcr 

GCIGASCACr 
GAAGACTTCA 
AACTATACGS 
TUUrrGCAATG 
GACCAAAAAG 
ACCX^SGGOTB 



7S0 
840 
900 
560 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
15€0 
162 Q 
1680 
X740 
1800 
1860 
ld20 
^9B0 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2530 
2580 
264D 
2700 
2760 
2820 
3080 
2940 
3000 
3060 
3120 



SI 
1 

GNEBIBGVDB 
LVI/GTCKSTP 
VGFVNKKGPy 



TGGRKDVTEN 
AVKGVSELSG 
BVKYYEKQEQ 
SSFSISffiSS 
PGIATYVDPB 

Azirounnjyx 

liRKHDAQFTV 
IiEDDPEAAYT 
QDVIRAVDBG 
ITfiAAAftPStf 



60 
120 
3180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
7BO 
840 
900 
960 



51 
1 

TTGTCCTTTG 
CAGGAGOCAC 
TCATCTTTCT 
CATCCTTTCT 
GCTAClTCXrr 
GTGCTAACBIC 
TCATTGCTGA 
TCCTGAlCGTT 
CTCAAGT6TG 
ATTTTGAlBGA 
ACAAGGTCAC 
TAGAGGGTTQ 
GTGTGGCAGC 



60 
120 
180 
240 
300 
3€0 
420 
4B0 
540 
600 
660 
720 
780 



TC3C3AATTGOG

COCTGGCAAG 
AATGGACCia 
ATOCKTGACr 
GTAGCCnSIT 
TAGTGGTGAT 
AACTGAAA.TC 



GGC3CrOGAGC 
CACTTCTGCC 
CRQCftGTGAl' 
CCCTTTCTOC 
TTCCTTCCAT 
CT6TTGCCCA. 
CCCAGTGCTC 
AGCAGAGCCT 
TTAAAAAA 



XGGCTGCCAT 
TCTGCCACTA 
TOGGGGAGGG 
TCCAGACTTQ 
TG6TGOGTOG 
7TCCCCCAGT 

CTGGGTGGAT 



GftlTGTOTCC 
CTGCTGOCAC 
GACAGQRTCT 
GG<5CTAGATA 

CTATTAAACC 
TGAGAGAAAG 
GTCsIAGAAGG 



ATGTATCTGT 
ATGGGAACre 
AACAATOTCA 
GQSACCRCTC 
GGCATTCCAG 
CTTGFATATQC 
GCSL'JTf'fTATA 
CACTTCAAAA 



ACTGCAATCT 
TGAAGAGGCA 
CTTGGGCCRO 
CTTTTAC3C3GG 
AGCCrCIAAG 
CCCCT'AGGCC 
GGCarGGGCAT 
TGCATAAAOC 



Seq ID NO: 604 Protein sequence
Protein Accession #: NP_005718.1

1 11 21 31 41 51

| | | | | |

MQCFSFIKTK MILFMLLlFIi CGLAAIitAVGI WV&lBGAfiEli KlFGFIi&SaA HQWSVQYPIa 
IAAGVVVFAIj GFI^CTOSUCT SSKCKLVTFF FIUJblFJAS VAAAWAXiVy TTHAERFLTL 
liWPAXKKDY 68QBDFEQVW NTXMXGLKCC GmXTEDPSD &FYFKENSAF PPFCX3ai»iVT 
VTANSrCrBQ KAHDQEVEQC F^UmXRT KAVTVGGVAA GZGOLEtAAM IV8MVLYC33L 
Q 

Seq ID NO: 605 DNA sequence

Nucleic Acid Accession #: NM_000729.2

Coding sequence:



1 
1 

GGCTCAGCTG 
AGOCATGAAC 
GAOGCAGCCG 
GGCXXXTTAGQ 
G6CGCTGCTG 
OGTTAAGAAC 
CXOGAOXSGAT 
GCCGOCATCA 
TCACSiCrCAT 
TCAATOTBAA 
TGXGGAAACI 
TATGCXATTA 



11 
I 

CCQGOCTGCT 
AGCGGOGTGT 
GTGCCTGCOB 
CAGCTGAGGG 
GCAA9ATACA 
CTGCAGAACC 
TTTGQCCOTC 
GCCCAAGGGA 
AALTrUftTTOT 
AATTGTGTCT 
GAAQACAAAA 
AAGTGATTTC 



31 
I 

COGGTTGGAA 
GCCTGTGOGT 
C3iGATCCOOC 
TATGGCAGA6 
TCC9i6CAC3GC 
TGGACCCC2VG 
GCAOTOCCSaA 
AGCAACCTCC 
CTBTQGAlGXT 
GTAAGATI6T 
CTOTTTTCTT 
ATXCTGCC 



31 
I 

AOGDCAAGCC 
GCIGATGGCG 

AACGGAT6GC 
CCGGAAAGCT 
CCACAGGATA 
GGAGTATGAG 
CAAOGCAGAG 



OCSUsTGCAAC 
CATCTOTGAC 



41 
1 

AGCTGCC3GTC 
GTACTGGCGG 
CTGCAGOGG6 
GAGTCOOGAG 
CCTTCTGGAC 

TACCKCrCCT 
GAGGCZUSAAT 
TGTATCTATT 
CACACACGCI 



51 
I 

CTAATCCAAA 
CTGGCGCCCT 
CAGAGGAGGC 
CSsCAjCCTGOQ 
GARTGTCCAT 
ACTACATGGG 
AGAGGACCCA 
AAGACAACAA 
TATTAAGTTC 
CACCAl3AAaT 
AAAATCTT6T 



Seq ID NO: 606 Protein sequence
Protein Accession #: NP_000720.1

1 11 21 31 41 51

| | | | | |

MHGGVCLCVEi HAVXAA6AZ.T QFVPBADPAG SGLQEIAEBAP BRQLRV8QRT OGB&RAELGA 
IiZJUnaCQAR KAPSGRMSrV KHK^NIiDPaH RISnEtDYHGi? HDFGRRSA^ YEVP? 

Seq ID NO: 607 DNA sequence

Nucleic Acid Accession #: NM_001423.1

Coding sequence: 219..692



1 
I 

A£3CACTCTCC 
ATAACCTCG6 

cAcdoacnscr 
tctttgtgqt 

GQTTGOTTTC 



IIGATTCTCTC 
C3CATGGAGAA 
GCATTCTTQT 
ATCACCACG6 
GGGTXGTCIA 



CGGAGAXAGG 
TGAGGGGATT 
GATOCTCCCT 
GTTCTTAGXG 
TxaGGTCATGC 
TGGCAAGAGC 
TCTGACCCAA 
GTGAGCCATT 



CTTTCTCACAA 
GTCCCTCATO 
GCTCTTCTGG 
AACIATGGGA 
tCTOCCCrCAA 
TCAXTQTAGC 
A'PCAATATTG 
GAGIOATCAC 



11 

1 

AGCCTCTCAC 
GAlSGCGGGTC 
GOCAGCACCT 
ACAAStlTACC 
CCACATCGCT 
CAAXAOGGXA 
GUSOCTGTCA 
TATCATCTTC 
GGGAAACOGG 
GQGCSGTQTCC 
CTATTCCXAC 
TCIQGTCCTS 
ASGSAAGOOGS^ 
GGAGGGGGGA 
CTCTACTQCC 
TGATG6GGTC 

ACACacaCTQ 

ACTGAOGTCG 
AGATACTGIC 
AGCAAAACAT 
GTCCCTCTIT 
AGAGAGTTT6 
TGGAAAACCT 
GAGACCTCAT 
AGTTTCTCTA 



TGCA6TCAGT 
AOUVTCCTAT 
AGGATCAOGG 
CTCl!rGGGGA 



21 
I 

CGGAAAATTA 

GCCACTCAGA 
AAAAAAAAAA 
ACTGTTATTA 
GATGCATCAS 

TGIGTCATTG 
TTCTTCCTCT 

atctagacta 
atcctgggct 
agaaagaaat 

TQAATCTGQG 
GGGGGAAGCA 
AAGCOCCTGC 
CAGAGAGOCar 

ACSUSACCTAC 

GACA1TCCAG 
GGAACaGATA 
TAASGTCATG 
QGGGGATGGT 
GCCATGGTCX 
AAGTCACTAG 
AGTCTTGCAA 
GIXTCTTTTA 
TACTCTTCCT 
CTCClAGGCr 
CXXrTGCCTAT 



31 
I 

cAica\GoccnG 



GGGCCTCTGr 
GAfi<!3C3UU:!AT 
TGCTATTTGT 
laSSTCTTTG 
AAGATGCCCT 
CCCTGCIGGX 
CAGC3G0CCAC 
GTCATTATGC 
GQATCTQCTT 
AAGGCOGGZ^ 



AAGGGGGGAG 
CCTGGGGAGA 
CXXTGCAGCC 
CATCAOCTGC 
tTGCACIQAST 
ATAOQCTAAG 
I^CTGAAGIGC 
TTTAGCTCTG 
CIGGTGGGZT 
C3U3AGCCCAG 
TXGCTAGGCC 
TGAACAATTC 
CAGTGCXSVTG 
AGTATACAAC 
CIAACA'i"l"X"r 
CAGTGGXAGC 
CCCACTTCAC 



41 
1 

TACZ^OCAGCA 
ATACITCCAG 
GGCIGGGACC 
aTTQC37ATTG 
TAQCACJCSm* 
GAAAAACIOT 
CAAGACAG!rO 
CTTCXSTGTIC 
CACftCTCSGTQ 
GAAT06TGAT 
CTQCTTCAlSC 
GAGTTCATGG 
AGQTTGCTQT 
GTCAAATCCX: 
AAGTAGTTGG 
AQQ^QACTTQ 
CACAACAOCA 
TAAAATAGCG 
OCTGGAAGCC 
CTACDGGGGG 
TGGAATTCAG 
AGCTAAACCA 
TOGRGAGCTC 
TCTTGCIGAA 

ggtggi:aaaa 

TTAGGGTTAT 



51 
1 

GAGGAAACTT 



CTTCAOAAiCT 
CTGGCTGGTA 
GCCAATGTCT 
ACX3ACATTA 
GAGGGCITCA 



TGCXGGCTGT 
GGAACSGCAGT 
TTCATCATCG 
GGAtCXGGGG 
AC3U3GAAAAA 
AAACXMTAC 
CTAGTACTTT 



OTACAAGTTC 
ATCCTGCOCT 
GCTTTGGCCT 
TGACAAAATG 



TGAGGAACTT 
TCTGGCXXAG 



ACACACGGCT 
AGOCAAGGCA 
GTACXACACA 
GmTTAGGA 
tqgacatggc 
TIQTCTAATT 
ACACCACCTG 
TGGCAATTCT 



840 
900 
960 
1020 
1080 
1140 
1200 
1260 



60 
120 
ISO 
240 



60 
120 
180 
240 
3DD 
360 
420 
480 
540 
GOO 
660 



60 



60 
120 
IBO 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
84D 
900 
960 
1020 
lOBO 
1140 
120 P 
1260 
1320 
1380 
1440 
ISOO 
1560 
1620 
1680 
1740 
1800 
□GAAQCTG&T
. CTATATAATT 
GCATTOCCAG 
TCTTTTCTTT 
CTOTCTQCTG 
TAAf3CW3GCA 
GOOGrTTTTT 
1GAGACATCI 
TCCTTTTTQG 
ATTCTATTXC 
CTAGGCTGAG 
TQAITTCATT 
CA6ACATATC 
ACCATSTATT 
ATTACTCTGG 
ATACTATAAT 
TTCTGATTCC 



TAAAACACAC 
GTGAAGIATT 
(SAAAATAOGA 
AAAATAAAAA 
OCGCftSGAGC 
TTC5CTTTGCC 
GGGAAGAOGT 
TGCXTTACTTT 
GGAGTTGTTA 
TCTATGTTTA 
GTTAGAGACSA 
ATCACATGAT 
OUMGGGAAT 
GCCITATCTT 
TGGATTGTTC 
TGTAAATATT 
CTTCAAAAAA 



ATATIACCAAA 
AZUSCCTACOQ 
AAATCCCAXG 
AGCAAAAACT 
AqCTCTATAC 
CTGGAQCAGC 
TTTCTTTATC 
TCTTTATTAG 
TGCCATGATT 
TTCTAGTTAA 
TTGGCCAfiCA 
TATAGAAGGC 
ACTCACATXT 
TPACTTTTTT 
TAOTACTGTA 
TTGATACAAA 
AAAAAA 



ACCAAACAAC 
TATTTCAGOC 
ASATAAATAA. 
CTTGTGOTAC 
AGGACTTAGA 
TATTTTAAGC 
GCCCTGAGAA 

TnTGGTATTT 
OGAAATGITG 
AAAACTGTGG 
TGTCTTAGTtS 
TGTTAAGAAa 

TcraxdACAT 



TQTT7ATAAC 



AOGOCCTTGG 
ATGATAAGAA 
AAATATAGGT 
CTKaXCAGAT 
A5TA6TAT6T 
CATdOySAT 
GATCTACCCC 
ATCCATTTCT 
ATTQXAAAAGG 
AGGGCAAflOC 
13AAGATGAAC 
CAAAAAACAT 
TTGAACnAlG 
TTATGTCrCA 
TCGTTAATA6 
TCTAGGQATA 



GTGAAAGGT6 
CAGAGTGGCT 
GATGGGCAGA 
OGTAGACGA6 
TATTCCTGCJT 
TCTGTCTAAA 
AGGGAGAATC 
TTTATACCTT 
ATTATTACTA 
ACCAAATTAC 
TTTGTCATTA 
ACITACATTT 
ACTGGACTAA 
TGIIAATTTGC 
ATTATTTCAT 
TAAAAAICMBA 



1860 
1920 
1980 
2040 
210D 
2160 
2220 
2280 
2340 
24t>0 
2460 
2520 
3S80 
2640 
2700 
2760 



Seq ID NO: 608 Protein sequence
Protein Accession #: NP_001414.1

1 11 21 31 41 51

| | | | | |

MIiVIiLASIFV VHIA3VIHLF VSTIANVnZiV SMTUDASVGi:! WRXCTIIJSCS DSLSYAfiSDA 
LKTVQAFKXI. SIXPCVIAUi VFVFQLFIHB KGNKFFLSGA TXLVCKLCIZk V(3VSIYTSHr 
AMBDBTQmiH GYSYILGWIC FCFSPTXGVIi YLVIAKK 

Seq ID NO: 609 DNA sequence

Nucleic Acid Accession #: NM_004961.2

Coding sequence: 55..1575

1 11 21 31 41 51

| | | | | |

GOCAGASCST GAGOOGOOAC CTCCSCGCAG GTOQICaCBC CXSGTCICOGC GGAAA1GTTO 
TC5CAAAGTTC TTCCAGTCCT CXITAGGCATC TTATTGATCC TCCRGTCGAO GGTCGAGGGA 
CCrCAGACTG AATCAAAGAA TQAAQCCTCT TCCCGTOATO TTGXCTATGG OOCCCAGCCC 
CAGGCTCTGCS AAAAtCAGCT CCTCTCTGAG OAAACAAAGT CAACTGAQAC TOAGhCrGGG 
AGCAGAGTTG GCAAACTQCC AGAAGCCTCT OGCAXCCTQA ACACXATOCX GAGTAATTAT 
GACCACAAAC TOdaCCOSGG CaKriOGIU3A0 AAQCCCACTG TGGTCACTOT TOAOATOGCC 
GTCAACAGC3C TTGGTCCTCT CTCTATCCTA GACATGGAAT ACACCATTGA CATCATCTTC 
TCCCAGACXIT GGTACGAOSA ACGC3CTCTGT TACRAOGACA CCTTTGAGTC TCTTGTTCT(5 
AATGGCAATG TGGTQAGCCA QCTATGGATC COGGACACCT TTTIIAjSGAA TTCTAAGAGG 
ACCCKSSMSC AXfSAGATCAiC CAT6GC3CAAC CSlS&TtSGTCC GCAICTACAA OOATGaCAAG 
GTGTTGT&CA CAATTAGQAT WXXrrGXT 6CCGGATQCT CACTCSCACAT GCTCAGATTT 
CCAATOGATT Cra«:TCTTG CCXTTCTATCT TTCTCTAGCT TTTCCTATCC TGASAATOAG 
ATGATCTACA AGTGGGAAAA TTTCAAGCTT GAAATCAATG AGRAOAACTC CTt3GAAi3CTC 
■rrCCAGTTTQ ATTTTACAGG AGTGAGCAAC AAAACIGAAA TAATCACAAC CCCAQTTGGT 
GTkCXTCATGG TCATGAGGAT TTTCnCAMT GXGAGC&SOC GOmOBCSA TSTTGOCTTT 
CAAAACIATV3 TGCCTTCTXC CGTGACCACC? ATGCTCTCCT GGGTTTOCTT TTQGATCAAjS 
ACAGAGTCTG CTCCA5CX3CG GACCTCTCPA QGGATCACCT CTOTTCTGAC CATGAOCACG 
TTQGSCACCT TTTCTC!GTAA GAATTTCCCG OGTQTCTOCT ATAICACAGC CTTGQATTTC 
TATATOGCCA TCTGCTTCGT CTTCTGCTTC TGCGCXCTGT TGGAGTXTGC TGTGCTCAAC 
TnCClGAlXnr ACAAiQCAGAC AAAAGCCCSVT QCZXCTCCTA AACTOQQCCA TCdCGOKtC 
AATAOOCGTG CCCADQCCCS TACSOGSTGCA OGTTCCCQflG CCXCTGOCOQ CCAACATC3U3 
GAASCTTTTG TGTGCCAGAT TOTCACCACX Gi^SGGAAGTG ATGGAGAGQA GOSCCCGTCT 
TGCTCASCCC AQCAGCCCCC TAGCCCAQQX' A£9CCCTGAGG GTGOCOGCAG CCTCTaCTCC 
AAGCTGQCCT GCTGITGAGTG GXCCAAOOST TrXAAGAAGT ACTTCTGCAT &IICCCQS»X. 
TGIGAGGGCA QTACCTOGCA GCM3GGG0GC CTCTO CZATCC A3X3TCCACGQ CCTB SATA AC 
TACTOSAlSAG TTGTTTTCCC A)9TaAC!lTTC TTCTTCTTCT. ASCPXQCTCTA CTOGGTT6TT 
TGOCTTAACT TOTAGGTACC AGCTOGTACC CTdTlGOGGCA AOCXCTCCAG TTOCCCAGGA. 
GGTCCAASOC CCTTGCCAAG GGIUGXTGGGS GAAAGCAGCA GCAGCAGCAG GAGOGACTAG 
AGTXTTTCCT G0CCX3^TT0C OCAAACAiSAA GCTTGCAGAG GCxlTA'GTCTT TGCSGCCCCr 
CIGGCCIACC T93GCCATIC ACXCSLGTCIT CTCAGCAlQAC CAXTICAAIVT TATXAAXAAA 

TGGecxau:cT occrcrrcTT caassascat cggt^tgct cAdsaircnA aaccacagoc 

ACTTAQTOAT CASClCCCXA JOACCJOQCC TAAGTACAG6 OGGATTAGCI ATCTTCCAAC 
AATGGTGACC AOCAQAOU^T TACTGCAXTT TTOCAGAAQIC OCACTATTGC CXTTGTAGT6 
CTTTC6GCCC A&TXCTGGGC TCAGGCTCAA AlSTGCACCGA CTftSTTG C T T GCCTATAGCT 
GGCACCTCAT TAA8ATGC30 GGCAGCAGTA TAACRiOOAGB XmOSKSCCC TCTCCITTG6 
TCAQATTATT ATOXTCIGlua TTCXCTCICC CTQCTACCCC TTTCf CXGCA GATAGATAGA 
CACXGGCATT ATCCCTTTAG OAAGAGGGQG GGGCAGOU^ AGAGCCTATT TQGGACAGCA 
TTCCTCTCTC TCTQCrGCTG TGACATCax:C CTCTCCTTQC TOGCTCXaiC TTTOSTCTGC 
ACIACCAATT CAATGCCCTT CATCX»AT(3Q GTATCTATTIF TTGTGTGTf3A TTATAGXIAAC 
TACICCCTGC TSTSKSlSt&CC AOOCTCITGC TTCTCTTTQA CCCCSGXGhC TGITTCXGTA 
ACTTTCGCAG TGACTTCCCC TAGOCCTGAC GCnOBCACTA GGCCTTGGTS ACTTOCXOGG 
GCCAAGAAAjC TAAGGAAACr OGGCITTGCA ACAOGCATTA CTCGCCATTG ATT66TGCCC 
ACCCAQGGCA CACTGTC30aA <JrTCTATCAC TTGCXTGACC CCTGGACCX3^ TAAACCAGlC 
CACTGTXATA C3CCC3GGGCAC TCTAAOCATC ACAAXCAAIC AATCAAATXC CCTTAAATTT 
CmiOGCACr GGAACXXXGG CAAAGCACTT TXGACAM3TT GTGTCiaATT GGAGCXTCAX 
GATAGCCTTG TGACATCTTT AGGGCAGGAT TCXTATCCCG ATTTIGCAGA TGAAAACCCT 
GAGTCACAGA TTTCTGrGOG ACTGTOOAXC TCACTGGAAS CIATCCAAGA GCGGACTGTC 
ACCTTCTACSA OCACATGATA GGGCTAGACA GCTCAGTTCA CCATGATTCT CTTCTGTCAC 
CXCIGCXGGC ACACCAGT6G CAAOGCSCCAG AATG6GGAOC TCTCXTTAGC TCAATTTCIG 
GGOCTGAQGT GCXC3tf3ACXG GC300CAAlGAT CAAATCXCTC CTGGCTGTAG TAAGCCAGTB 
GAAtSAATTT GQACATQCCC CAATGCXXCX AXATGCTAAfi TGAAAlClGr 6XCTGTAATT 
180
240
300
360
420
480
600
720
780
840
900
1060
1260
1620
TGTTCGGGOG GQGGTCTCCA. TCTACTTTTT GTCAOCATCIl 3120
GGAAATATGT AAATAAATAT ATCAGCAAAiG CAAAAAGAAA AAAAAAAA 

Seq ID NO: 610 Protein sequence
Protein Accession #: NP_004952.1

1 11 21 31 41 51

| | | | | |

KLSICVliPVLIi 6IU»I]jQSRV BGPQTESKNB ASSSDWYSP gPQFLBDIQUi SEETKS T ET E 60

TGSSVGKLFE A8RIUSTIZ*S NTDHKtiRPGI GEKETWTVB lAVNSLGPLS ILDMEfTIDX 120 

IPeCJTHTDEa LCYNDTPESL VXJIGNVVSQi:, MXPDTFPRKS KRTHEHBITM PNQMVRIYKD 180

GKVLYTIRMT IDRiGCSIiHML RFPMDSHSCP L&PSSPSYPE JlEMIYKJfENP KLEIMBKNSW 240 

KLFQFDFTCJVr SNKTBIITTP VGDFMVMTIF PWVSRRPGYV AFOTfVPSSV TTMLSHVSFW 300 

XKTESAPARr SIAITSVLUM TTU3TF&R1CN FFRVfiYITAI* DFYIAICFVF CPCAI.LEFAV 360 

I^LinaaTK AHASPKIi£tHP RIKSRAHART RARSRACARQ HQSAFVCQIV TTB6SDGBER 420 

PSCSAQQPPS P6SPEGPBSL CSiOACCBHC KRFKRyFCHV PDCB66THQQ GRliClHVYSI. 480 
DSnrSRWFPV TPFFFUVLYK liVCUJXi 

Seq ID NO: 611 DNA sequence
Nucleic Acid Accession #: NM_021984.1
Coding sequence: 572..1753

1 11 21 31 41 51

| | | | | |

eOCASaAGGGT GASCCGOGAC CSOTaCOCAO OTGGTCQCGC Ca^rCTCOaC GOAAArrOTTG 60 

TCCAAftGTTC TTCCAGTCXTT CCTAGGCATC TTATTGATCC TCCAGTOSA6 AACATGTAtCA 120 

C2\GAGAAGIG CTCAAATCAT AAGTGTACAO CrGATGAlSTT OTCSUU\AAAT OACCACAGCG 180 

OTGTAAAGAA AfiCCAAATCA A8BACCOGAA T6TGAGCAGG ACCTCAGAAG OCCCCTTTGT 240 

CACTGCCTGC CAGCAAAGGC AGCACIKtOC GGACTTCTAA CACCATGGOe ICGAGGQACC 300 

TCAGACTORA TCAAAGAATQ AAflCCTCTTC C0GTGAT9TT GTCTATGQCC 00CM30CCCA 360 

GCCTCTGGAA AATCftfiCTCC TCTCTGAfSGA AACftAAGTCA ACTGfiSACTG AGACTGOGAG 420 

CAGBU3TTGGC AAACTGCCAG AAGCCTCT06 CATCCTGAAC ACTATCCTGA GTAATTATGA 480

CCACAAACIG C5GC3CCTGGCA TTGQAaai3AA OCiCCftClGrG GXCACTISTTQ AGATCTCCGT 540 

CauUSGCCTT GGTOCICTCr CXATCCTAGA CATGGaATAC AOCATTtSACA TCATCTTCTC 600 

OCAGACCTG6 TAOCAOGAAC GCCTCTGTTA CAAGGACACC TTXGAGlTCrC TIGTTCTQAA 660 

IQQCAATGTG GTGAGCX»GC TATGGATCXX: GGACACCTTT TTTAGQAATT CTAAGAGGAC 720 

OCACGAGCAT GftGATCACCA TGCCCAACCA GATGGTCCGC ATCTACAAGG ATOaCAAQOT 780 

GTTGTAC3«Ji ATTAGGATGA CSCATTGATGC CXSCaATGCTCW. CTCCACATGC TCftSATTTCC 840

AATGGATTCT CACICTKSCC CCCXXXCXTT CXCa^CtCT TCCTATGCXG AGAATOHOAT 900 

GATCTACAAG TGGGAAAATT TCAAQCMGA AATCAATSRO AAeAACTCXa? G6AAGCTCTT 960 

CCftGTTGGAT TlTACfeGGAG TCAGCAACAA AACTGAAftTA ATCACMCCC CAQTOGOTGa 1020 

CTTCATG0TC ATGAOGATTT TCTTCAATGT GAGCftGGCGG TTTGGCTATG TTGCCITTCA 1080 

AAACTATGTC CCTXCTTCCXS TGACCA£3GAT GCTCTCCIGG GTTXCCOTT GOATCAAOAC 1140 

AGAGTCTQCT CCflGCOOGGA C9CTCTCTAGG GATCACCTCT OTTCTGACCA TGACCA06TT 1200 

GG6CACX:tTT TCTGGTAAOA ATTTCCOQOB TOTCXCCTAT &TCA0U3C!CT TGGAITTCXA 1260 

TATCX30CATC TGCTTOGTCT TCTGCTTCTG OGCTCT6TTG CaVGTTTGCTG TGCTCAACTT 1320 

CCTGATCrCAC AACCAGACAA AAQOCXATOC TTCTCJCTAAA CTCOSCCRTC CTCGTATCJUi. 1380

1:AG0CX3TQC3C CATGOCCGTA CCOGTGC3WK TTCCOGftGCC TGTOOCGBCC AACATCAGGA 1440 

AGCTTTT6TG IGCCAGATTG TCACCSbClOA GGGAMSlGAT GGAOAGGAOC GCCX^TCITQ 1500

CTCAQCCCAQ CA6CCCCCTA GCOCASSTaB OOCTSRGGGT OOCOGCAGOC TCTBCTOCftA 1560 

GCTGOCCTGC TGTGAGTGGT GC3lAfiOSTTT TAAGAAGTAC TTCOXSCATGG TCCCCQATTQ 1620 

TGAGGGQ\GT ACCTGGCA6C AGGCCOGGCT CTGCATCCAT GTCIACCGOC TGGATAACTA 1680 

CTCQAGAGTT GTTTTCCCaG TGACamCTT CrTCTTCaAT GTGCrCEACT GGCTTGTTTQ 1740 

CCTTAACTTG TftQOTACX»G CTGOTACCCr QTOGGQCMC CTCTCCTGTT CXXnGGAOS 1800

TCCAAGCGCC TTGCCAAGGG AGTXGGGGSA AAGCSUSCAGC AGCAGCS^GGA GGGACCAGAS 1860

TTTTTCCTGC COCATTCCCC AAAGAGAAGC TTflCaWSAGQQ TTTOTCTTTQ CTGCCCCTCT 1920 

COCSCTACCTG GGCCATTCAC TGAGTTTTCT CAGCASWXA TTTCAAATEA TTAATAAATG 1980 

GGOCACCTCC CTCTrCTTCA AGGAGCATCC G^OATGCCOk GTQTTCAAAA CCACAGOCAC 2040 

TTACTDQATCA GCTCCCTAAA ACCATGOCIA AC3TACAGGC6 GKTTAGCIAT CITGCAACAA 2100 

TGCTGACCAG CAGAC3ATTA CTGC3WTTTTT CCBGAAGCOC AiCTATTSOCT TXaCAGTOCT 2160 

TMQQC3CXaQ TTCTOGGCTC AGCCTCAAAO TCKSWOGACT AGTTGCTTGC CTATAOCTGG 2220 

CACCTCATTA AGATGCTGGG CAGCAGTATA ACAGGAGGAA GAGATCCtTTC TOCITTGGTC 2280

AGAXIATTAT GTTCTCAGTT CICTCECCCX QCTflfSXCTr TCTCTGCAGA TAQATAGACA 2340 

CTGGGATTAT COCTTIAGGA AGAGGGGGOG GCAGCAAGAG AGCCIhTITG GSKCNSCta^t 2400 

CCTCTCTCTC TGCIGCTGIG ACSkTCTOCCT CXOCTTQCra QCTCCATCTT TOQTCTGCAC 2460 

TACCAAT TCA ATGCCCTTC3\. TCJCAATGGOT RXCTA3TTTT GTGTGTGAIT ATAGTAACIA 2520 

CTCCCT6CTT TATATGCS^AC CCTCTTCCTT CTCTTTORCaC CCTOTQACTC TTTCFGTAAC 2580 

TTTCCCAOra ACTTCCCCTA QCCCTQACCC AGGCACTAGG CCTTGGTGAC TTCCIGGGGC 2640 

CAIUSAAACTA AtSGAAACTCS GCTTTGCAAC AGGCATXACT OSCCATTGAT TGGTGOCCAC 2700 

CCAGGGCACA CTGTCGGAGT TCTATCACTT GCTTGACCCC VGGa£!C9CATA AAGCABTCCA 2760 

CTGTIAVAGC CGGOGCACTC arAACCATCAC AATCAATCAA TCAAATTCOC TTAAATTTGT 2820 

AVBGCACTG3 AACTTTGGCA AAGCACTTTT GACAAGTTGT GTCIGAXIGO AGCTXCATGA 2880 

TAGCCTT6TG ACATCTTTAG GGCAGGATTC TTATCCCCAT TTTGCAGATG AAAAOCCTGA 2940 

GTCftCAGATT TCTOTGGGAC TGTGGATCTC ACTGGaaGCT ATCJCAAGAGC OC3lClGTC3kC 3000

CTTCTAGAOC ACATGATAGG GCTAGAiCAfiC TCAGTTC3K2C ATGATTCTCT TCTGICACCT 3060 

CTGCTGGCAC A£X!AQTGGCA AOGOOCfiGAA TGGOGACCTC TCTTTAGCXC AATTTCPGGQ 3120 

CCIG&G8IGC TCAGACI6CC CCCAAGATCA AATCTCTCCT BOCTGTAGTA AC<3CAGTGGA 3 IBO 

ATGAATITGG ACKXGCCCCA ATOCTTCrAT ATGCTAAGTG AAATCTCTOT CTGTAATTTa 3240 

TTOGGGGOTa GATAQGQTGG 6GTCTCCATC TACTTTTTaT CACCATO^TC TGAAATGGG6 3300 
AAATATGTAA AaAAATA'XAT CAGCAAAGC 

Seq ID NO: 612 Protein sequence
Protein Accession #: NP_068819.1

MBYTIDIIFS QTWYDBRIiCY tiJDTFE&LVUI GNWSQIiWIF I>TFFRHSKRT HEHBITZ4PHQ 60

MVKIYKDGKV LYTIRMTIDA GGSIiHHLRFP KDSHSCPLGF SSFSYSBNEH lYKNEHFKnB 120 

INEKHSTRKBF QLOPTGVSNK TEIITTPV6D PMVMTIFFNV 5RHFGYVAFQ KYVPS&VTTM 180 

LSWVSFWIKT fiSAPAHTSliG ITSVLTMTTL GTFSRKlflFPR VSYITAIJ3FV lAICFVFCFC 240 

MiLEPAVUlF l/rniQTKAHR. SFIORHPRIN SHAHARTRAR SRACARQHQE AFVCQIVTTB 300 

QSDGEBRPSC SAQQPPSPGS PBGPRBLCSK LACCEMCKRP JOKFCMVPDC BGSTWQQftllL 360
CIHVYaiiDMy 8RWFFVTFF FPHVIiinffLVC IiHL 

Seq ID NO: 613 DNA sequence
Nucleic Acid Accession #: NM_021987.1
Coding sequence: 572..1657

1 11 21 31 41 51

| | | | | |

GC3CAQA00GT GAGCCGCGAC CTCOGCGCAG GTGGTCGCGC CGGTCTCCTC; GGAAATGTTG 60 

TOCAAA6TTC TTOCAGTCCT CCTAGGCATC TTATTOaTCC TCCAGTCGAG AACA.1QTATA 120 

CfUsUUSAftOTG CTCAAATCHT AAOrGTACING CTGATGAGTT GTCAA2UUULT GACCACAGC6 180

GTGTAAAQAA AGCCAAATCA ASCSACGCSAA. TQTGAGCAOS ACCTCAGAiU& CCCCCTTTGT 240 

CACTGGCTCC CAGCAAAOGC AQCACTATCC GGACTTCTAA CSVCCATCJGGG TCGAGGGAGC 300 

TCAGACTOAA TCAAAGAATG AAGCCTCTTC CCQTQATGTT GTCTATGGCC CCXJW3C5CCCA 360 

GOCTCTGGAA AATCAGCTCC TCTCTGAGGA AACAAAGTCA ACTQAQACTG AGACTGGGAG 420 

OUSAGTTGGC AAACTGOCAO AAOCXTTCTCG CATCCIGAAC ACTATOCIGA OTA&XnVIGA 480 

CCACAAACTG CGCCCTGGCA TTO^OaOAA GCOCACTGTG GrCftCTOTTQ AQATCTCGGT 540 

CaACftGCCTT GGTCCTCrrCT CTATCCTAGA CATG^^AATAC AC3CATTGRCR TCATCTTCTC 600

OCaGACCTOa ARTTCTAAGA GCSACO^ACQA QCATCAGATC ACCATGCCCA ACCAGATGGT 660 

CCX5CATCTAC AAGGATOaCA AGGTGTTGTA CACAATTAGG ATOACCATTG ATGOOSGAIXS 720 

CTCACTCCAC ATQCTCAGIAT TTGChA!lGGA TTCTCaCPCT TGCCCTCTAT CTTTCTCTAG 780 

CTTTTOCTAT OCTGAGAATa AGATQATCTA CRAGTGGGAA AATTTCAAGC TIGAAATOA 840 

TGAGAAGAAC TCKTrCGAAGC TCTTOCftGTT TOATTTTACA G6AGTGAGCA ACAAAACTGA 900 

AATAATCACA AGCCKlAGrTQ GTOACTTCAT GGTCATGTVOG ATTTTCTTCA ATGTGAGCRG 960 

GOG gTTTGQC TATGTT6CCT TIC&AAACTA TOTCCCTTCT ' TOCGTGACCA CXSATOCTCTC 1020 

CTGQOTTTCC TTTTGGATCA AiaACAGAGTC TSCTCCAGGC GGOACCXCXC TAGQ6ATC&C 1080

CTCTGTTCTG ACCATGAOCA GGTTGGGCAC eTTTTCTOOT AAGAATITCC CXSOOTGTCTC 1140 

CTATATCACA GCCTTGGATT TCTATATOGC CATCTGCTTC GTCTTCTCBCT TCTGCGCTCT 1200 

GTTGGAGtTT QCTOTGCTCA ACTTCCTGAT CTACAACCAG ACAAAAGCCC ATGCTTCXCC 1260 

TAAACTCOSC CATCCTOGm TCAATAGCXXS TQOCCATGCC a3rACCa3TG CACGITCGCQ 1320 

A60CIGTOCC OaCCAACATC AGGAAGCTTT TGTGTQCCAG ATTGICACCA CraAOBQAAG 1380

IGATGOaSAG GAGC96CCCX3T crTGCTCWSC OCAGCAGCCC OCTAQCCCAG GTftGGCCTGA 1440 

GG6TCCCC3GC AOCCTCTGCT CCAAGCTGGC CTGCTGTGAG TGGTGCAAGC GTTTTAAGAA 1500 

GTACETCTGC ATGGXCOCOS ATTGTGAGGG CAGTACCrGG CWSCAQOGCC GCX:XC1GCAT 1560 

OCATGTCTAC OSCCTGGATA ACTACIOSAG AOTTOTTTTC CCM3TGAC£T IXJrA ' Ci ' J .' CrA ' 1620 

CAATOTGCTC TACTGGCTTQ TTTQGCTTAA CTTGJ3VBGTA CC!AGCIGC?rA C C C I Gr GGUM 1680 

CAACCICTCC AGTTCC3CC2^ GlUSOTCOUU? C C C CT TC OCA AGGGAGTTGS GGGAAAGCAG 1740 

CAGCAGCAGC AGGAQGQACT AGAGTTTTTC CTGCCCCATT COCCARACAG AAGCTTOCAG 1800

AGGGTTTQTC TTTGCTGCXX: CICTCCXX™ CCTGGCCCAT TCACTGflQTT TTCTCAGCAG 1860 

ittXarrrCAA ATTATTAAXA AATOGGOCAC CTCCCTCTTC TTCTAGGftCC ATCGGTQATQ 1920

CTCAGTGXTC AAAACCACA6 CCACrTTAGlXa ATCAGCTCGC TAAAAOCArFS CCTAA5TAGA 1980 

GGOOGATTAG CTATCTTCCA ACAATGCTGA CCACCAlQACA ATTACTGCAT TTTTCCAGAA 2040

GCOCACTATT QCCTTTGCAG TGCTTXaSQC CCAGTTCTGG CCTCRSOCTC AAASTGCACC 2100 

GACTAOTTGC TTGCCT ATAG CrOGCACCTC ATTAAGATGC IGGOCROCSVG TATAACAGQA 2160 

GQAAGAGATC OCTCTCCTTT GGTOUSAXXA TTATGTTCTC ASTTCTCECI CCCTGCTACC 2220 

ccTTTC Jcrq cagatasata gacactggca ttatoccttt AOGRAGauaoa gogggcusga 2280

A<»m3CCTA TOrlGGGACaG CATTCCTCXC mDCTGClOC TGTGAGATCr CCCTCICCTT 2340 

6CTGGCTCCA TCTTTOSTCT GCACTACCAA TTCftATGCOC TTCATCCAAT GGOTATCTAT 2400 

TTTTGTGTGT GATTATAOTA ACTACTCCCT GCTTTATATG CXaOOCTCTT CCTTCTCTrT 2460 

GACCCCTQTS ACTCTTrCTS TAACTTTCC3C AGTOACTTOC OCtAQCCCIG AOCAGGGhCT 2520 

AGBOCCTGQT GACITOCTGQ GGCCAAGAAA CTAAGSAAAC TOSSCTTIGC AACAOGOUTT 2580 

ACIOGOCATT GATTOaTGCC CACGC3U3GGC ACACTGT066 ASSTCTArCA. dTGCTTGAC 2640 

OOCTQOACCC ATAAAOCRjOT CCACTGTTAT ACCOGOeOCA CTCTAACCAT CACAATCAAT 2700 

CAATCAAATT CCCTTAAATT TGTATGGCAC TaOAACTTTO OCAAAGCACT TTTQACAAGT 2760 

TGJ GTCTG AT TGGAGCTTCA TGATAGCXHTT GTGACavrCTT TAGOQCftOQA rFCXTATCCC 2820

CATTTT6CAG ATGiAAAAGOC TGRGTCAC2SG ATTTCTCSTOG QACIGTGGAT GTCACIGGAA 2880

GCTATCCMG AGCCCACTGr CAOCTTCXAO ACXACAIGAT AGGGCTAtlac AGCVCRGITC 2940

AC2CKTGATTC TCTT CTOT CA OCTCTGCTOG CACSUCCASTG GCAAGGGCCA GAATGGCGAC 3000 

CTCI CTTTAG CTCAATTTCT GGGCCTGAGG TGCTCAGACT GOOCCCAASA TCAAATCTCT 3060

CCTGGCTGTA GTAACCCAGT GGAATGAATT ^XSGACATGCC CCAATGCTTC TAIATGCTAA 3120 

GIGAAATCIQ TGTCTGTAAT TO3TXGQQQQ QTGGAIftGGG TQGQGXCSQC ATCTACTTTT 3180 
TGTCACCATC ATCTQAAATG GGGAAAXATO TAAAXAAATA 1ATCAGGA2\A QC 

Seq ID NO: 614 Protein sequence
Protein Accession #: NP_068822.1

1 11 21 31 41 51

| | | | | |

MBYTIDIIFS QTMNSKRTEB HEITMPNQMV RIYJCDGSCVLY TIHMTIDAGC SUMLSPEMD 60 

SHSCFIi9?S8 FSYPSNEMIY ICHENFKLBIN EXKSNXIiFUF DFTQVSeHCTE IITTFV^FM 120 

VMTIFFMVSR RFGYVAFONT VPSSVTTHnS WVSFniKTSS APART8L6IT SVLTHlTDGT 180 

FSRKNFFKVS YITAXJ7FVIA ICFVFCFCAL IiBFAVIiHFZiZ lOlQTKnHASP XLRRPRIHSSt 240 

AHARTRARSR ACSASQHQBAF l/CQIVTTBGS DGEERPSC8A QQPPSP6SFB GPRSLCSKLA 300 

CCBNCKRFKK YFCHVFDCBG 8ZHQQGRLCI KVYRLDHYSR WFBVTPFFP WltYHLVCUr 360 
II 
PCT/US02/36810



Seq ID NO: 615 DNA sequence

Nucleic Acid Accession #: NM_021990.1

Coding sequence: 1309..2490

1 11 21 31 41 51

| | | | | |

6CCASAGCX3T 6AGCC3GOGAC CTCCX3CGCAG GTGOTCGCGC OGGTCTCOGC GGAAATG7TG 60

TCCAAAGTTC TTCCAOT'CCT OCTAGGCATC TTATTGATCC TCCAGTOGAG AACATQTATA 120 

CAORSAftCTG CTCAftATCAT AftOTGrnCAG CTGATGAGTT GTOUOfflAAT GAOCACAfiOG 180

6TGTAAAQAA. AOCCAAATTCAi AGGACOCGAA TGTGA6CAG6 AOCTCAGAA6 CGCCOTTTGT 240

CACTGCCTCC CAGCAAAfiec AOCRCTATCC GCSftCTTCTAA CACXJUCTOT GA6TTTC3VTA 300

CCTTGtSCAOA TQGCCTTTAA CATTTTTGTT TAATTCMTT ATTCTTACTA ATCTTCTTCT 360 

TTTTCTTGGC TGTGGTGCAT GQCTGTGGAa CTCAQGOTQG ACTCCTGTTG GGCAGCCfUST 420 

TCCTGGATQQ CTGTCTGTGG GTGGAGGftCT CCTGCCTTTC CTGITTAGAC ACCCACAAAG 480 

aCTGCrCTTT AGOCTCCTTC CCTTCRTOCC CTTCC3CCTOC CCCX3U?rGCA AOGMSTATTA 540

CACAACCAAC AAAACC3C3<ZRA AATATTOCCA CAATTTTCTG GTOCTCTCTG QGAGAGGCOa 6P0 

CTCTGGCTTT TCCTCTCAGC CCTQQCCCrc TGCCTGCTCC TCACTCCTGG TTQGTGCTGG 660

TCAGGC30AC TAGAQQCCAA GGOGACCAAC ACTAGGCAAA CGCGGCCAGC QCTCftSACAT 720 

AAATSOCCTC TTChTTTCAC GVeXAAClhTT CXTTTAAAKT CTAQOTCTTG GTTTTGTTGA 780

TXXTTTCTXA AATAAAAGA6 TGATCKTAAA AGAGGQAQU3 CATAfiAAAGT CCCCAAAGAG 840

CA3C3UW3GTT TTAAAGAAAT TGACAAGCCT AATCTGTCAC TGTCTTATAA TTTGCTATTA 900

CCAGXGAC3UW TTTAACTAGG TTTTGTGTTG AAAACTTGTT TTGGTTTQCT TCTaTCCCAA 960 

GAfSGCACTAG CTGOOGCCCC TACAiGAGTGC AGGGCASAOC TTCATTTTTC GTTTGAATGT 1020 

TCrAOQSTOS AGGOACCTCA. GACTGAATCA AAGAATGHAG GCICTTOCGG TGATGTTSTC 1080

TATQGCCSOCC AGCCCCRGCC TCTGGAAAAT ChGCTCCTCT CTOAGQAAAC AAAGTCAACT 1140

GAGACIGAGA CTGGGAGCAa AQTTGGCWiA CTGCCAGAAG CCTCTC3GCAT OCTGAACACT 1200 

ATCCTQAQTA ATTATGACCA CAAACIGCGC CCTGGCA'PTG GAGAlSAAGCC CACTGTGGTC 1260 

ACTGTTGAGA TCTOCGTCAA CAGGCTTGGT CCTCTCTCTA TCX^TAGACAT GGAATACACC 1320 

ATTGACarCA TCTTCTCCX^A GACCIGGTAC GAGGAACGCC TCTGTTACAA OGACAC2CTTT 1380 

GAOTCrCTTG TTCTGAATGG CAATGTGGTG AlQCCnaCXAT QQATCCCGGA CACCTTTTTT 1440 

AGGAATTCTA AfiRGOACCCA CGAGCATGAG ATCACCATGC CCAACCAGAT GGTCCX3CATC 1500 

TACAAGGATG GCAAG(3TGTT GTACiVChATT AGGATOACCA TTGATQCCGG ATGCTCACTC 1560

CACRTGcrCA OATTTCXMT GGATTCTCAC TCTTGCCCTC rATClTTCTC TAGCTTTTCC 1620 

TATO CTGAG A ATGAGATGAX CXACa^AGTGG GAAAATTTCA AGCTTGAAAT CSU^TGAGAAS 1680

AACXCCTGGA AGCTCTTCCA GTTTGATTTT ACAGGAGTGA GCAACAAAAC TGAAATAATC 1740

ACAAOCCCAG TTGGTGACTT CAXGGTCATG ACQATTTTCT TCAATGTGAG CAGGCGGTTT 1800 

GGCTAXGTTG CCTTTCAAAA CTATOTOCCT TCTTOCGTGA CCAGGATGCI CTCCTGGOTT 1860 

TCCTTTTQGA TCAAGACAGA GTClGCrCCA GCCCGGACCT CTCIAGGGAT CACCTCTGTT 1920 

CtGRCdOOtk. CCftaar TGOg CAOCTTTTCT GGTAAGAATT TCOOGCGlGr CTCCZATATC 1980

AC3U9C3CTT06 ATTTCTATAT CGCCATCTGC TTC3GTCTTCT gCTTCTG O BC TCTGTTGGAG 2040

TTTGCTGTGC TCAACTTCCT GATCTACAAC CAGACAAAAS CCCAMCTTC TCCTAAACTC 2100 

CXSCCATOCTC GTATCAATAG CCGTGCCCAT QCCOQTACX:c GTGCAOGTTC COGAGCCTGT 2160 

GOCOGCCAAC ATCAGOAACC TTTT6TGTGC CAGATTGTCA OCACTGACGO AAOTOATGOA 2220 

GaggnGOBOC ogtcttgctc ASOcxnacAS cxxscctagcc caqqtaggcc tgagggtccc 2280

OGCawaCCXCr GCTCCAA^CT GGCCTGCIGT GAGlGGl^aCA AOGQTTTTAA GAAGTACTTC 2340

TGCATGGTCC CCXSATTGTGA GQGCAGTACC TGGCAGCAGG GOOGOCTCrG CATCCATGTC 2400 

TAOCGCXTTGa ATAACTACTC GAGAGTTGTT TTOCCAGTGA CTTTCTTCTT CTTCAATGTG 2460 

CTCTACTQGC TTGTTTGCCT TAACTTGTAO OTACCftGCTS GTACOCTOT6 GGGCAAiCCTC 2520

TOCMSrxCCC CAOGAOSICC AAGGCCCIT6 OC3U^GGGAGT IGGGSGAAhO CAC3CAGCA0C 2580 

AGCftOGAGOB ACTA6AGTTT TTCCTGCCGC ATTUCUCAAA GAGAAGCTTG CAGAjGGGTTT 2640

GTCTTTGCTG OCX:CTCTC0C CTACCTGGCC CATTCACTGA GTTTXCTC3K5 CAQACCATTT 2700 

CAAATTAITA ATAAATGGGG ChCClCCCTC TTCTTCAAOa AGCATCOGTG ATGCTCAGTG 2760 

TTCAAAAOCA CAGCCACTTA GTGATCAGCT OOCTAARAGC ATQCCTAAGT ACAflQCBGAT 2820 

TAGCTATCTT CCMCAAT6C TGAOCACChG ACAAXTACIG GATTTTTC9CA GAAiSCCCAGT 2880 

ATTOCCTTTG CMS/tGCSmC GGC3CX3kaTTC TOgCC rC AGC CTCAAAGTGC ACOSACTAGT 2940

TGCTTGGCXA TACCI6GCAC CTCATTAASA TGCTGGGCA6 CIVSTATAACA GGAGGAAGAG 3000 

ATOCCTCTCC TTltSGTGflaA TTATTATGTT CTCftGTTCTC TCTCCCXGCT ACCCCTXTCT 3060 

CTGCAGATAO ATAQACACIG GCATTATCOC TXTASGAAGA GGGQGGQGCA GCAAGAGAGC 3120 

CTATrroOGA CAGCATTCCT CTCTCaJCTQC TOCTOIGACJA TCTCOCICIC C n - GOl GSgr 3180 

CCATCTTTOG TCTSCACTAC CAAITCAAT6 OOCXTC i aCC AATOGGISaC TATTTTTOTG 3240

TC3TQATTATA GTAACT ACTC CCTGCTTTAT ATGCXAC3CCT CTTCCTTCTC TTTGaOCCCT 3300 

GTGACTCTTT CXSTAAfnTTT COCAJGTGACT TOCOCTAGCC CTGACCAGGC ACTAGGCCTT 3360 

GGIGACTTCC TGGGGCCAAG AAACTAAGGA AACTOSfiCTT TQCAACauaaC ATTACTOGCC 3420 

ATTGATrGGT GOCCAOCCRQ QGCACACTOT CSGGAOTTCTA TCAdTGCTT GMXXX^TGGA 3480

CCCATAAACC AOTCCACTGT TATACOC3GGS GC!ACrCTAAC CAICACAATC AATCAATCAA 3540 

ATTCOCrrAA ATTTGtCATGG CACTQQAACT TTGGCAAAGC ACTTTTGACA AGTTGTGTCT 3600

QATTGOaQCT TCATGATAGC CTTGTGACAT CrrTAGOGCA OGATTCTTAT COCCATTTTG 3660 

CAQATGAAAA CCCTGAGTCA CAGATTTCTQ T0GGACT3TG GATCTCACTG GAAlGCIAXOC 3720 

„ AAGAGCCCAC TGTCACCTTC lAGACCACAT G»TA8GGCXA GA<3U9CTGAG TTCACCA1GA 3780 

OTCTCTTCIG TCACCTCTeC TGGCACAGCA GIGGCAAGGC CCAGAATGGC QACCTCTCPT 3840

TAGCXCAATT TCTGQGCCT6 ASGTOCTCAB ACTGCCCGCA ASATCAAATC TCTCCTGGCT 3900 

QTAQTAAOCC AQTQQAATGA ATTTGGACAT GOCCXaATGC TTCTATATGC TAAGTQAAAT 3960 

CIGTGTCTGT AATTXGTTGG GGGGTGOATA (^OIGGGGTC TCCATCTACT TTTTGTCACC 4020 
ATCATCTGAA ATGGGQAAAT ATGTAAATAA ATAXAICAGC AAAGC 






Seq ID NO: 616 Protein sequence
Protein Accession #: NP_068830.1

1 11 21 31 41 51

| | | | | |

hEKTIDIIBS QTWYDKRWJY KDTFESLVUI GWWSQWJIP DTFFENSKRT HSHEITMPMQ 60 

MVRXnroGICV LYTZSHTIDA GCSIjHHIiKFP MDSHSCPLSP SSPSYPBdJEM IYKWENPKLB 120

IMEKHSW ICLF QFDFTGVSHK TEIXTTFVGD FMVKTIFSHV SRRFGYVAFQ NYVPSSVTTO 180

LSPfVSFHIKr SSAPARIfiliS IXSVLTMTTI. GIFSRKNPPR. VSTITAIJJFY lAICPVPCFC 240 




ALLEFAVLNP LIYNQTKAHA SPKLRHPRIN SRAHASTBAtl SRACARQEQB AFVOQIVTTE 300 

GSDQEERPSC SflQQPPfiPGS PEGPRSLC3K LACCEHCKaP KRYFCHVPDC E6STHQQGRI> 360 
CIHUTCHIJJNY SaWPPVTFF FETIVLYWiVC l^Ox 

Seq ID NO: 617 DNA sequence

Nucleic Acid Accession #: NM_004864.1
Coding sequence: 26..952

1 11 21 31 41 51

CGC3AAC1GAGG GCAACCTGCA C3W3CCATGCC CGGGCAAGAA CTCSIOGACGG TGAATGGCTC 60 

TCawaATGCrrC CTGGTGTTGC TOGTQCTCTC GIGGCTOCCG CATGGGGGCQ CCCTOTCTCT 120 

GQCaSAGGCG AGOCGCOCAA GTTTCCCGGa ACCCTCftiSAG TTQCACXCOG AAGACTCCRG 180 

ATTOCGAGA6 TIGCQGAAAC GCIACSAGGA CCTGCTAACC AGGCTGCGOQ CCAACCflGAQ 240

CTGSGAAGAT TOGAACAGGG ACCTCX3TCCC QGCX3CCTGCA GTCCGGATAC TCACGCCAGA 300

AGIGCGGCTG GOATCCGGCG GCCACGTraCA OCTGOQTATC TCTOGGGCCQ CCCTTCCCQA 360 

GaGGCTCX3CC GAGGCCTCCC GCCTTCACXK QaCTCTGTTG CQGCTGrCCC GGACQGCGTC 420 

AAGGTCGTGG GACQTGACAC GACCGCTtHCG GCGXCADCTC AGCCTTGCAA GACCCCAAQC 480 

GCOGGOSCIQ CAOCTGCKAC TGTCGCCGCC GCQOTOGCAG TCDOACCAAC TGCTGGCAC3A 540

ATCTTOGTCC GCACGGCCCC AGCTGGAGTT GCACTTOOGG CCGCRAQCOG CCAGGGGQCG 600 

CCGCAGAGCG CGTGCGCQCA ACGGGGACOA CTGTGCGCTC QOGCOCGGGC GTTGCTGCOS 660 

tCTOCACACG GTCJCaOGCGT CGCTOQAAGA CCTGOGCTGG GOOGATTOOa TCGCXGTC5C3CC 720 

AOGGGAGGTQ CSWSTGACCA TGTGCATCGS CGC3GTGCCCG AGCCAGTTCC OGGCQGCAAA 780 

CKnaCACGOG CAGATCAAGA CGASCCTOCA CCGOCttQAAG OOOGACACSGQ AGCCAGCGCC 840 

Z3 CTGCIGOGTO CCCGCC3WX2T ACAATCOCAT GarOCTCATT CftAAAQACCS ACACaQGG<3T 900 

6TGGCTCCAG ACCTATGATG ACTTGTTAac CAAAGACTGC CACTGCATAT GRGCA6TOCT 960 

□OTOCTTCJCA CTGTGCACCT GDGC5GGGGGA GGCmCdCA GTTGXCCTGC CCTGTGGRAT 1020 

GGGCTCAAQQ TTGCCGAOAC ACCOGATTCC TGCXX3UVACA QCTQTATTTA TATAAGTCTG 1080

TTATTTATT& rTAATTlATT GSGGTQACCT TCnOGOGAC TCGGGGGCTQ GTCTGATCSGA 1140

ACTGIGTATT tCATTTAAAAC TCTGC3T6ATA AAAATAAAGG TOTCTGAACT GTTAAAAAAA 1200 
AAAA 






Seq ID NO: 618 Protein sequence
Protein Accession #: NP_004855.1

1 11 21 31 41 51

| | | | | |

HPGQBLRTVN GSQHLLVliIiV IiSWLFHGGAL SIiAEASRASF POPSELHSBD fiRFRKLRKay 60
BOUABUUSI QSHEDSN1I>Z> VPAPAVRUTT PEVRLGSGQH IBLRISRAAI. PEX5IrPEA3HI. 120
HKl HRAIf HliSPT ASStSHDVTRP XiSRQLSLARP QAPALHLRLS PPPSQSDQI*Ii ASSSSARPQL 180
EEtHIiRPOAAR GRRRASAIQIG DDCPUBPGRC CRIATVSASIi EDLGtfADHVI» SPREVQVTMC 240
ZGACPSQFRA AHIffiAQIXTS LHRLKPDTEP APCCVPA8VM PHtflilQKIDT GVSLQTYDDL 300
lAKDCHCI

Seq ID NO: 619 DNA sequence

Nucleic Acid Accession #: NM_003979.2

Coding sequence: 254..1357

1 11 21 31 41 51

| | | | | |

ATAACSMSCAT GAAQTGC0C3T GGAACTGGAA TAGGCGTGTC CTCTCCETCaa AOCCTCCXXC 60 

TCCTTGK30C TCTGCTCRCC CCTCGCTOST TCCCTOCGTC CQGCGAGQGC CJQCCTTTATA 120 

A^ACTOCTC AGflGTgOGftG GGCGaOATAG CTGtC(2RA0G TCTOCCCCAQ CACTGASGAG 180 

^ CrC0OCTGC!r OOCCrOmC QCGOSGGAAS CAGCAGCAAS TTCAOGGOCA ACGOCTTQGC 240 

Dj ACIAGGOTCC A<mTGGCIA CAACAGTIXC TGATGGTTGC GGCAATOGGC TCAAATCX3UV 300 

GTACTACAGA CTTTSTGATA AOOCTGAAGC TXGQGGCATC GTCCTAGAAA OGGTQGOCAC 360 

AGCXSGOOaTT GTGACCTOGG TGGCCTTCAT GCTCACTCTC OOSATCCICO TCTGCAAGGT 420 

QCAGGACTCC AACRGGOSAA. AAATGCTGOC TACTCRGTTT CTCTTCCTOC TCGQTOTOTT 480 

OGGGRajCTTT GGGCXCACCT TOGCXarPCM CRTCQQikCTG GAGGGGAeCA CaVGGGOCXaw:: 540 

OV ACGCTTCTTC CTCTTTGOGA OrCCrCTTTTC CATCIGCTTC TOCTQCCTGC TCGCXCRTGC 600 

TGTCAGTCTG AOCAAGCTCG TCGGGGGQRG GAAGCOCGTT TCCCTGrTGG TOATTCIGGG 660 

TCrGfiCOGTG GGCTTCAGGG TAGTCX5PU3GA TGTTATCaCT AT3GAATATA TTGTOCTGftC 720 

CATGRATAOQ AOCAACGTCA AT6TCTTTTC TOBGCTTTGG GCTCCTCGTC GCftATQAAGA 760 

CTXXGTCCTC dGCTCACCT AOGTOCTCTT CITGArrGGGG CIGAOCXTCSC TCATGTGCTC 840 

Oj crrrcAocTTC totcoxtccp TcaasoGGiG eaAGAsacAT ggggoccaca tctacctcac 900 

GATGCrCCTC TCGRTrTQCCA TCTGGOTQGC CTGGATCaCC CTGCTCATOC TTCCTGACTT 960 

tOACOGCAGG TT3GGRTGACA OGATOCTGAG CICOOOCTTG GCIGCGAATG GCTGGGTGTT 1020 

CCTGTTGGCT TATGnCAGTC OCGASTTTTG OCTGCTCACA AAGCAAGGAA ACOCCATGGA 1080 

„ TTATOCXGIT GAGGATGCXT TCIGI^AAACC TCAAGTCBT6 AAGAAGAlGCr ATGQTOTGGA 1140 

/U GAACAGAGCC TACTCTCAAG AGGAAATCAC TCAAQGTTTT GAAGAGACAG GtSGACACOCT 1200 

CTATQOCCCC TATTOCACAC ATTTTCAOCT GCAGAACCAG CCrCCOCAAA AGOAATTCTC 1260 

CATCITOCGG GCCCACGCTT GGOCGAGOOC TEACAAAOAC TATGAAGTAA AGAAAGftGQG 1320 

CAGCTAACTC TGTCCTGAAQ AOTGGGACAA ATGCAGCCGG GCBGCAGATC rCRGCGGGAGG 1380 

TCAAAGGGAT GrGOGGGAAA TCTTQAGTCT TClGASAAAA CTGTAQkAGA CACTAOQGGA 1440 

/-> ACAGTTIGCC TCCCTCOCRG C2CTCAACCAC AATTCTTCCA TGCTGGGGCT GATGTGGGCT 1500 

AOTAAGACIC CAGTTCTTAG AGGCGCTGTA GTATTTTTTT TTTTTTGTCT CATOGTTTOa 1560 

ATACrrCTTT TAAGTQGGAG TCTCAGGCAA CTCAAGTTTA GACCCTXACT CTTTTTGTTT 162 0 

GTTTTtTGAA ACAOSATCTT GCTCTGTCAC GCAGGCTTOA GTGCAGTGGT GOGATCACAG 1680 

CC3CBflTGa«3 CCrOGROSkC CTOTGCTCAA QCAATCCTCC GAICTOCATC TCCCftAACXG 1740 

CTGGeATGAC AGGGGTGAGC faCRGCTCOC AGCCTAGGCC CTTAATCnG CTGTTATTTT 1800 

CCATGGACTA AAGOTCrGGT CAICTQAGCI CA03CTQGCT CACACAGCTC TAGGGGCCTG 1860 

CTGCTCTAAC TCACAGTOQa TTTTGTGAGG GTCTG1X5GCC CAQAGCAGAC CTGCATATCT 1920 

GAGCAAAAAT AGCAAAAGCC TCTCTCBGCC CACTOGCCTG AATCTACACr GGAAG<X»2U: 19B0 

TTGCTGGCRC OCOOGCTCCC CAACOCTTCT TGCCTGGGTA QGAGAQGCTA AAGATCACCC 2040 


TAAATTTACT 
ATTCACfijOGT 
TAATCTCCCC 
CTCCTTGTCA. 
CTCATCTTGC 
GTGGGCATGQ 
GCAATATUUSA 



CATCrCTCTA 
CACCCCTCTC 

GGAGAATTTG 
TCTGGTGOGC 



GTQCTGCCTC 
ITCTTGCACT 
C3C»iQ(«2^TTC 
TAGATCRiTTC 
CrexAAATAQ 
GAfl BAGTG TT 
ACrCTTTCAT 



ACATTGGGOC 
GTCCCCAAAC 
TTTCRJSACCT 
TCACTTCAAA 

CKTTGTATAA 



TCAGCAOerC 
TTGCTGTCAA 
CACTASCACA 
rrCCTGGGGC 
TTTAGGQCTQ 
TAACTTA^XC 
GCAAAAAAAA 



CCCAGCACCA 
iTCCGAtSATC 
AQCCXIGGTTG 
TGATACTTCT 

cattctotaa 

AOCTGAGTAT 
AAAAAA 



Seq ID NO: 620 Protein sequence
Protein Accession #: NP_003970.1



1 

I 

KATTVFDGCR 

KI.VRGHXPI.S 
LTYVLPLMAli 
I]£>TII>SSAIiA 
SQEBITQGFE 



11 
I 

NGLKBKYYKL 
PLLGVLQIFD 
LIiVILGLAVG 

ANGWVFLLAY 
filVSOTIiYAPY 



21 31 

I 1 
CDKABAWGIV LETVATA6W 
UTFAFZraU7 OSTQPTAErPIi 
PELVQDVIAI BYIVLTMNRT 
QSFTOHKIIHG JUlIIYIilMiajS 
VSFEFWLLTK ORNPHDYPVE 
SlHFQIiQEIQP PQKSF&XPRA 



Seq ID NO: 621 DNA sequence
Nucleic Acid Accession #: NM_002423.2
Coding sequence: 48..851



1 

I 

AJCCAAATCAA 
TGCTGTGTDC 
GAOGCRTGAG 
ATGACICAGA 
TCTTTGGCCT 
CCRGRTQTGQ 
CTTCCAAAGT 
TGSAICJGATT 
GGAAAGTTGT 
ACTCClAiOCiC 
OTCTCGGAGG 
GGAITAACTT 
CCTCTGATCC 
AACTTTCOCA 
QAAAGAAATA 
TGXTGCACAA 
CT!mCTTA.TX 
ATGGlt?rGAC 
AITGTTACKTA 



11 

1 

CCATAGGTCC 
TGTGTOCCTG 
TGAGCTACA6 
AACAAAAAAT 
ACCTATAACT 
AGTOCCASAT 
GGTCACXnrAC 
AGTGTCAAAO 
ATGGGGAACT 
ATTTGATGGO 
AGATGCTCAC 
OCIGZATGCT 
TAAT6CAGIG 
GGATGATATT 
GAAACTTCAG 
TCaOAATTQA 
GCA6TTGGTT 
TGTGTCTTAT 
CaCAAAX-AAA 



21 
I 

AAGAACAAIT 
ctgcctggca 
TGGGAACAGG 
QCCAACAOTT 
GGAATCTTAA 
QTTOCAOAAT 
AGGATCGTAT 
GCTTTAAACA 
GCTGACATCA 
C(2AG6AAAOi 
TTOGA1GAGG 
GC3UUCTCATQ 
AT6TATGCAA 
AAAGOOCTTC 
GCA6AACATC 
TAAGCACTQT 
TTTGAATGTC 
TCCATdSlTG 
TAAAAT6XTT 



31 
1 

GTCTCTGGAC 

CTCRGGACTA 
TAOAAaCJCAA 
ACTOOOSOGT 
ACTTCACTATT 
CATATACTCXS 
TGTCGOGCAA 
TGATTGGCTT 

aacToacTCA 

ATGAAGGCIG 
AACTTGaC2CA 
CCTATGGAAA 
AGAAACTATA 
CATTCATTCA 
TCCTCCACTC 
TTTCACTOCT 
AjBCITTGTCA 
ATTOCAIGGT 



41 51 

I I 

TSVAPWLTLP ILVCKVQDSM 
FGItiPSlCFS CIiLAHAVSI>T 
SnmVFSSLSA PERNEDFVLIi 
lAIWVAWITL IiMIiPDFDHRW 
DAFCKPQLVK KSYGVENRAY 
HAMPBPVKDY 



41 
I 

GGCAGCTATG 
QCCGCTQCCr 
TCTCAAGAGA 
ACTCAAGGAO 
CATAGAAATA 
TCCAAATAGC 
AGACTTACOG 
AGAGATCCCC 
TGOGC6AGGA 
TGCCTTTGCG 
GA OGGA TGGT 
TTCTTTGGGT 
TG6AGATOCC 
iJGGAAAdAGA 
TTCATTGGAT 
CATTTABCAA 
TTTATTGGTT 
OTGCGCGTAB 
AAATTTA 



SI 
I 

CGACTCAC(3G 
C3U3GAGaCGG 
TTTTATCTCT 
ATGCAAAAAT 
ATGCAGAAGC 
CCAAAATQOA 
CATATTACAG 
CTGCATTTCA 
GCTCATGGGG 
CCTGOQACAG 
AGCAGTCTAG 
ATGGGACATT 
CAAAATTTTA 
AGTAATTCAA 
TGTATATCAT 
TTATGTCACC 
AAACTOCTTT 
ATGTC9UVTAA 



1 
I 

AAACAGGAAA 

caggaaktag 
attctctcaq 
aogtggcigc 
ctgtgttggt 

TGGGC3CTGTA 
lia i i ' AljT CgAA 
GGGGCATCCA 
TiXTL'UCTAGG 
TGTGSGTTXAT 
QCCTGCTGTC 
TQVTACTCTT 
CTTAOGCCTG 
TOCTGG2U5ZT 
AATCAAGCAA 
CAOAACOQGir 
ACAGAmCTO 
CCAIAACOCA 
TCTCACCTTC 
AGRGAGATTT 
TATG'IUUGCA 



11 
I 

TAAATAOGAA 
AGGACTTC3QG 
GAAAAAAAAC 
7G6CAGAGCA 

acTroQCACoc 

TCCAAACAQC 
TGTGA&IGGG 
GATCATCATT 
G6AATACCTG 



21 
1 



TOGGAGTTTG 
CATCACA6AT 
GGGTGTGAAC 
TGGCAT06GA 
TGTGAGTGTC 
GACCTCACCA 
GAAQCAICTT 



ATTCTTCAAT 
TAAACAQATG 
TCX]IAGOCTCr 



AXCCXTCXAA 
AAGGTCOCCA 
AGCATGAATT 
CACAATGOrr 



CAGCCTGTGC 

GGCxarooCTc 

TCTAITTCAT 
TCTTCrCTCOQ 
OGCTTGAACA 
CTAAGTATTC 
CCTOGJU^TOG 
TGCGCATCTT 
ATCTATCCAA 
OCAASTTATT 
TCACTGGGAC 
GTrCTQACAG 
TCAGTCTAGG 
TTAACCAAGA 
OGGaCCTTTOG 



31 
I 

CTdAAGCAO 
CC9CIACCACC 
CAGCAAAGAA 
CGATGACTTC 
ATCCXGTGAC 

AGAAAGCTCT 

ACAToaocxn' 

TCTAOGGAGG 
TGGCABCAGA 
TCXaTOkGTGC 
GOCACGOUCA 
OGATTTCTGG 
CCabClTTGG 
ACATCTATGC 
CCAGIGIUGAT 
CAAAAOAAGT 
CIGAGGAAAC 
AAAOCATGCZ 
GGGACTOCCT 
CAOUCACACA 



41 
1 

CATGTAACCT 
CAACTGGCOC 
AAGGAATACG 
AGCAGXrCCXv 
C3CCAGGAATT 
TCCTQQaAAC 
GAAAGAAGGC 
CGQCTCCaTC 
CTTTCOCTTC 
AAATCAOCCA 
AATCIGCICT 
TGESCXAiCGGC 
OGT6CTGCTG 
CTGCC3V3TTQ 
AGCAAACCCA 
CCAA0CAAAT 
CCTCCTOCCT 
GTCTCTOCCA 
QTTTCTCTAT 
AG GGCaC AlG 
TTCGTGTGCr 



2100 
2160 
2220 
2260 
2340 
24O0 



Seq ID NO: 622 Protein sequence
Protein Accession #: NP_002414.1

1          11         21         31         41         51

|          |          |          |          |          |

KRLTVIiCAVC I^LPG&IiAIiPL POBAGGKSEL QHEQAQDYIiK RFYIiYDSETK lIANSIiBAKIrK 
04QiEFFGIiPX TGHCKSRVIS IMQKPRGGVP DVABYSIiFPir SFKffTSICWT YRIV8YTRDL 
PHITVDECLVS KKLBMHi^CBI P£aPRKWH6 TADXHIGFAR GftHXaTSypFD GFGHTIAHAF 
APGTGU3GDA HFDEDBRHTD GSSLGXilVTiy AATHBDQEBSEi GH^ESSPFigA VMYPTYGNGD 
PQt9FKI«SaDD I1Q6IQKL,YGK. S8KS&KK 

Seq ID NO: 623 DNA sequence
Nucleic Acid Accession #: NM_031457.1
Coding sequence: 204..956



60 
12 0 
180 
240 
300 



60 
120 
180 
240 
300 
3 SO 
420 
480 
540 
600 
6€€ 
720 
780 
840 
SOO 
960 
102O 
lOBD 



60 
120 
180 
240 



51 
I 

GGCCTGCATC 
CAG1!A£!ATTC 
ATCAAGAGAT 
GTGGCCAATX 
AT6TCTCAOG 
CCACCTAGTT 
AAAAOCrrGG 
ATGGQQACQQ 
TGGGGAGGCT 
TAJ"l'Cl'i*ATT 

QACTATTATC 
GTCITCTGCC 
OTCTQCTGTC 
GTGATCACCC 
AAGTAAGGCT 
TTCTGSGCTT 
CrGTTTGTAC 
CAAGAAGAAG 
CATCABCAJCA 
ctgctg^tg 



60 
120 
180 
24Q 

300 
360 
420 
4B0 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



TGAiGCTTGTS GGTETAOAGGA ACRAATATCT AGACATTCAA TCTTCACTCT TTCflATTaTG 1320 

CATrovrrrA ataaatagat actgascatt caaaaaaaaa aaaaaaaaa 

Seq ID NO: 624 Protein sequence
Protein Accession #: NP_113645.1

1 11 21 31 41 51 

I I I I I I 

MHBMTSAVEV AHSVI.WAPH WCTJPVTPGIM SHVPLYPK8Q PQVHLVPGHfP PfiliVSNVNGQ 60 
iU PVmCALKEGK TIjGIAIQIIIG ZiAHXGLBSIM ATVnVGSVIig ISPyOGTFFH G6LWFII&GS 120 
LSVAAENQPY SYdMGSLG LWIVS&ICSA VGVILPrCDL SIPHPYAVPD VYPYAWSOTIP 180 
GMAISGVLLV PCLLHPGIAC ASSHEGOQEiV C0Q5SNVSVX yPl!IIYAA39FV ITPBPVT5P9 240 
SYSSEXQAHK 






Seq ID NO: 625 DNA sequence

Nucleic Acid Accession #: NM_005221.3

Coding sequence: 1..870



1          11         21         31         41         51

ATGACAOGAQ TGTTTGACAG AAGGGTCCCC AGCATOCGRT CCGOCSaACTT GCAAGCTOGG 60 

TTCCAGAOGT CCGCA3C3CTAT QCACCATCCG TCTCAGGAAT OGCCftACTTT GCCCGAGTCT 1X0 

Ta«3CTACC3e ATTCTGACTA CTACAGCCCT ACGGQGGGAG CCCOBCTkCOS CTACTGCTCT 180 

CCTACCTGQG CTTCCXA1X3G CAAAOCTCTC AACCCCEAOC ASTATCAGTA TCACGGCGTQ 240 

jfcO AACGGCTCCG OCBOOAacrA CCCAGCCAAA GCTTATGCCJG ACTAlaWSCTA OGCTAGCTCC 300 

TACCaCCAGT ACGGOGGCGC CTACRACCGC GTCCCAAGCG CO^CCAAGCA GCCAGAGAAA 3fi0 

GAAGTGAOCG AGCCCQAGQT <3A{3AATGGTG AATGGCAAAC CAAAGAAAGT TCOTAAACOC 430 

AGGACIATTT ATTCCAGCTT TCAGCTGGGC GCATTACAQA GAAOOTTTC^ GAAGACTCAG 480 

TACCTOGOCr TGCCOGAACG CGCCGAGCTa GCCGCCTCGC TGGGATTGAC ACAAACACAG 540 

31/ GTGAAAATCT GSTTTCAGAA CAAAAI^ATCC AAGATCAAGA AGATQVXGAA AAAOGGGSAG &00 

ATGCCCC CGG AGCACAGTCC CAGCTCCAQC OACCCAATGG OGTGTAACTC GCXSGCftSXCT 6CO 

CCRGCQOTOT GGGAGCCCCA GGGCTOGTCC OGCTCGCTCA QCXZACCACOC TCATGOOC3U: 72« 

CCTCOGACCT CCAACCAC3TC CCCAQGGTCC AQCTAGCTGG AGAACTCTGC AICCTGOTAC 780 

ACAASraCAG GCaeCTCAAT CAATTCCCAC CTGCOQCCXSC GGGGCTCCTT ACAGCACCGG 840 
CTGGGGCTOG OCTOCQOGAC ACTCTATTAG 



Seq ID NO: 626 Protein sequence
Protein Accession #: NP_005212.1

1          11         21         31         41         51

|          |          |          |          |          |

MTGVEDKilVP SJRSGDFQAP PQTSAAMHHP BQESPTLPES &AISSDYY5P TGGAPBGYCS €0 
PTSASYGKAL NPYQYQVHGV UGSAG9YPAK ATfADYSYASS yHQYGGAYJIR VPSATNQPBK 120 
KVTEEGVBM7 NGKPKKVRKF RTlYSSFQiLA AliQRRPQSTQ VIAIiFEaAEL AASLGI/DQTIQ IBO 
VKIfVFQIIKRS KIKKZHKNGB HPPESSPSSS DEMACNSPQ8 PAVWBPQS9S KSDSHHPHAH 240 
PPTSHgaPAS SYLSHSASWy TSAASGINSB If PPGSJiaBP IAIA8GTLY 

Seq ID NO: 627 DNA sequence
Nucleic Acid Accession #: NM_014420
Coding sequence: 118..792



1          11         21         31         41         51

|          |          |          |          |          |

f - GCICTRG aGA CGAS3QXaCTG A0CIGCCAGC TTAGTGGAAG CTCTGCTCTO OOTGGftQAGC 60 

^3 AGCCTGOCTT TGGTGAOGCA CAGXGCIGGG ACCCTCCAOG AGOCOGGGGA TT6AAGGATQ 120 

QTGGOGGCOG TOCTGCTGQG GCTGAjSCTGG CTCTQCTCTC CCX^JGOQAGC TCTGC3TOCTG 180 

GACTTCAACA ACATCAGGAG CTCTOCTQaC CTGCATPGQG CCCGGAAGGG CrCSkCSMWGC 240 

CTGTCTGAGA OGGACTGCAA TACCAGAAA6 TTCSSCCTCC ASCCDOGGGA TGAGAAGOQ6 300 

TTCieieClK CAIQTOGIGG GTTGOSGAGG AGOKGCCluac GAGWrGCCSAT GTGCCGCSCXn* 360 

OU OQGACACrcr GTCXeAACSA TBTTTOTACT AOOATGGAAG KTGXSACCXXl AAIATTAGAA 420 

AGGCaSCntS ATQAGCAAGA TGGCACACAT GCAGAAGGAA CAAC3?(3GaCA CCCAGTCXaG 480 

GAAAACCAAC CCAAAAGGAA GOCSlAGTATT AAGAAATCAC AAGGCAGGAA GGSACAAGAG 540 

GGAGAAAGTT GTCTGAGAAC TTTTGACTGT OGOGCTGGAC XZTGCTGTaC TOGrCATTTT 600 

^ IGGAOGftAAA TTIGTAAGCC AGTOCTTTTQ OaOSGACAOG TCTQCIGCSG AAGAGGGCAT 660 

OD AAAGACACT6 CICAAGCTCK AGAAATCTTC C3U3GGTTGGG ACXiSBGaCCC TOGACXACIG 720 

TGTGSAAGCC AATTGACGAG CAATOGGCAG CATGC=rC39AT TAAGASTKI6 CXSUULAAAXA 780 

GAAAAQCIAT AAAllA'lTITCA AAATAAAlGAA GAATOCftCAI TGCAAAAAAA AAAAAAAAAA 840 
- A 

Seq ID NO: 628 Protein sequence
Protein Accession #: NP_055235

1          11         21         31         41         51

iJ MVAAVLLGLS HLCSPLGAt>V LDP!9EIIR8SA OLBGARIOSSQ CitSDTDCaSTa KFCLQPSDEK 60 

PFCATCRGUR liRGQRDAMCC! POTLCVUDVC TTMBDATPIIi ERQU)EQtX3T BABOTTGHFV 120 

QEKQPKRKPS IKKSQOaKGQ EGBSCLItTFD OSPGLCCAfiH FMTKICKPVD LBGQVCBKBG 180 
HKDTAQAPBX FQRCDOGPGL LC&SQI.TSHR QHABUCVC3QK ZBKTi 

Seq ID NO: 629 DNA sequence

Nucleic Acid Accession #: NM_002448.1
Coding sequence: 241..1134

1          11         21         31         41         51
GCGCGAGTGC 
ACDGOaSGAGC 
AOGCCOCOGG 
CGGCTGGCCA 
ATGA.CTTCTT 
06AGGCX3CGG 
GAGGAGGGG6 

CAGGCGGCGG 
OACQCGCXXT 
CCAG2UU3ATG 
CRQAOCCCCC 

CTOGAGGOCA. 

GCAAAOAGAC 
CXHACCSGGCTG 
GOGGGTQCCT 

GCAGGTCCOC 

CTCTTTOCTC 
GTCCCTTAGT 

CTAGAGGCCA 

AAAAAAAAAA 

TCCaVntSGAR 
ftCACROATGr 



TCCCGGGAAC 
TGGCCTGCTG 
CCCTCQCAGA 
GTGCTGCX3GC 
TGCXZRCTOCSQ 
GCCAGGCXrCC 
CCAAGCCCAA 
ACAQGAAGCC 
GTGGCTCGGC 
CTTCECCQCG 
CXSCTCGTCftA 
QCTTCTCCOC 
AGACX3AACC6 
AQTTCCGCCA 
GCCTCACTGA 
TACAAGAGGC 
CCTTCGGCCT 
CGCTCTACGG 
TCTACACX3GC 
ACCTQTGOGC 
OSGCAOOGCC 
CCTOAGTTCA 
ACTCrrCTAO 
GGTTAACftGA 

TGTCTccrac 

TGXCTGACC^ 
OAAAAGAQAA 
AATTTAAGAC 
GFTTGCAAAOG 



TCTGCCTGCQ 
GGGAGGGGCG 



AGAAGGGGGG 
TGTCAAAGTG 
CAGCGCDGCC 
AGTGTCCCCT 
GGGGGOCAAQ 
GCAGCCRCTG 
GCCGCTGGGC 
AGCOSAGAGC 
GCOGOOGGDC 
TAAGCXZGCGG 
GAAGGAGTAC 
OACGCAGGTG 
A6AGGT6GAS 
CTCCTTCCCT 
TGCCrCTGGC 
OCATtTTGGGC 
CAGCOGATTC 
AGCXraCCTTC 
CTCTOOGAAG 
CATTTAGSkTC 
TTTATCTAOG 
ATJWaCTTTTC 
TTTTQOAAAT 
AAAAA»5ACT 
AAOTTCftACh 
TA6GTTGAAG 



OGOCOQCAGC 
GGAGGGGCaC 

GcrrccxMCC 

GCCCXJGCTCr 
GAGQACTCX53 
GCGGCCAOSa 
TOGCTCCTGC 
GKSRGCGCCC 
G6CGTCCXX3C 
CZATTTCTCGG 
CCOGAGAAGC 
AOaCGGCTGA 
ACGCCCTTCA 
CTQTCCATCG 
AAGATATGGT 
AAGCTGAAGA 
CTCX3GOG3CC 
C3CCTTCCAGC 
TACZVGCATGT 

ctggagccct 

CCTTTAACCC 
TCTGATCCXrr 
TACACTCTCG 
GTCCOCAGCA 
CCTOTCCTGA 
GAGAACAATC 
AGCOUSCCAG 
ACAAAACATT 
OGA 



GACCOGAGGC 

GCXSGGAGGGT 

CGCC!a3GAGC 

6CRTGG0CCC 

CCTTCXMCAA 

CAGCCGCCAT 

CCTTCAGOGT 

TGOCOCCCTC 

CGGQGTCX3CT 

TGGGGGGACT 

CCGAGAOGAC 

GCOCCCCAGC 

CCftCCXSOSCA 

CCGAGCGCGC 

TCCAGAACCG 

TGGOCGCCAA 

CCJQCAGCTGT' 

GOGCXaSCGCT 

ACC^ACyZTTGAC 

GGIGCTGTAC 

TCACACTGCT 

GCCAAAAAGT 

ASTTAAAGAT 

GAATTGACA0 

CACCAG6CAA 

TCAAAAAAAA 

GAftGATSAAT 

tgctctgggg 



CAGGCCCAGC 
CCQCCCOOCC 
CCATGCCOGG 
GGCTGCTOAC 
GCCGGCGGGG 
GGGCGCGGAC 
OQAOGOGCTC 
CGA0QG0GT6 
OGGftGCCCCG 
CCTOWGCIG 
CCC6TGGATO 

cTGCAOCxrrc 

GCTGCTGGOG 
QQAGTTCTCC 
CCGOGCCAAG 
QCCCATGCTG 
AGOSGCOGCG 
GCCTGTGGCG 

atagagggtc 
cccxtqacgtg 

CCRGTTTCAC 
OaCTGGAAGA 
GGGGAAACTG 
TTISAACAG7U5 
GAAAAOCQCA 
AAAAAAAAAA 
CCTAGCTTCr 
GGCAGGGMVA 



Seq ID NO: 630 Protein sequence
Protein Accession #: NP_002439.1

1          11         21         31         41         51

|          |          |          |          |          |

MTBi:.PLGVKV SDSAF6KPAG QGAGQAPSAA AATAAAMGAD EEGAKPKVSP SLtiPFSVEAX* 
KAOHHKPGAK SSAItAPfiBGV QAAGSSAQFli GVPFGSLG&P DAPSSPRFLO E[F3VGGLIiKIj 
PKDALVKAB8 PESPBRTPKM QSPRFSPPPA KRLSPPACTL RKIDCnilRKPR TPFTTAQUA 
UVSK?RQKQT LSIASRAEPS 88LSLTKTQV KXHFOHRBAK AKRLQBAEEiE KUCNAAKFHI. 
PPAAPGLSFP LGGPAAVAAA AQAdliYOftSG PFQIIAKLPVA FVOLYTARVG Y&HYBLT 

Seq ID NO: 631 DNA sequence

Nucleic Acid Accession #: NM_002557.1

Coding sequence: 13..2049



1 
1 

CAOGATGOTQ 
GGOCCTGOCT 
GCCTTTGCCT 

ctctacccag 
atc3gg<::gggt 
c36tgaaaa6t 

GftCXTTTTCT 
CTCIICTTAA 
CCGAGGCTGC 
^TBTGGGCT 
G<£UU3TTGGG 
AAATCTTOSG 
ATCATGGQGA 



TTGGCTTATT 
CAGfTATGTCC 
TTCAGTTACA 
TCGGACATOG 
GTATTOAAIG 
Citsl'dVICrG 
TOGAOUACJXJ 
GGAAAGTGTG 
ACTGTATCCC 
ATGAOCATGA 
GTGCaOTCSlXC 
GIGACOGCTG 
CSVrCAGrTCTG 
CTTAQACAQA 
CCCTCCAGAA 
TTQACZICTG 
GAAAACAGOA 
GCrXTTGACA 
CCTCRAACAA 
GAAGCCTAAS 
ACAIGTTGQA 



11 
I 

A6ATGTGGAA 
CTQCCCATAA 
OGATCTTCCC 
CAAITGAACAA 
AGTTCAACAA 
GGAACTTIGG 
^TATTGCTTC 
TCTTATATCC 
7TGAAGAGCT 
TGCTGTCTGC 
TTCTAGGAAG 
AAAGGTTCAC 
CATATSCTAT 
TCCCCACCTA 
QAGGGATOG6 
TTGAOATTTS 
OGTATGCCAA 
AOGCATGGrr 
ATGACCJTCAG 
ATATCCTGGT 
CTGTGlUVriC 
ATAOTAAGAT 
AAAATATGAC 
TllGGaAAGCA 

AKTCIGTGAC 
GGGAAAAGI^ 
TGAtSCCCTQQ 
ATACTUvTQGC 
ACATATCAGT 
AGCiTGGGCAC 
TGATGCTGTC 
ACCGCTTTC5T 
GTCCTCTTTC 
CXZCCTCTGGX 

AGCCrrCTCA 



21 

i 

GCTOTTQCTG 
ACTOGTGTOT 
OCATGACCTG 
CAATCAGATT 
ACIAAAGGAG 
CAOCTCAAQA 

AormxATCC 

TGGACTAAGA 
CCTGTTTGCC 
TGCTGTTTCT 
ACTCCIGGAT 
AGQACATAAT 
GAATTATTGG 
TGQACX3TACC 
ACCAGCSVTCX 
TTGCTTTGTC 
CAAOOGGAAA 
TATAAGGCGA 
GQQCACXWTC 
GCE9GGCXGAG 
TTCRAGCACr 
TTTGCCCCCA 
TATAACTCOT 
CACTGTftGCT 
TCATCAGTCC 
CAeCOGACAG 
CCTGACCOCT 
A£3SAACGACT 
CCCTAGAAGG 
CACCCCXGAA 
TC3VCCCCAGG 
CTCCAGC0C3C 
TCCCaXCTAT 
TCTAAAAAAA 
GTCAGAAACC 
TOCCBGGGCA 



31 

i 

TOGGTT6GGC 
TATTTCACCA 
GACCX3CTT7C 
OTTGCXAAGG 
AGGAACAGAG 
TTCACCACIA 
CXTCTQAG6A 
GGCAGCCOCA 
TTCI^QGAAGG 
aGGGTCCX:AC 
TTCATCAATG 
AGCCOCCTCT 
AGAAAGCTTG 
TTTCGCCTCC 
CCAGGGAAG7 
TGOGGIUSOSA 
GAGTGGGTTG 
QRGCATTTTG 
TGTGGCACTG 
TTCftGTTCAA 
GACOCIGAAA 
OGAGGAGAGG 
AGAGGXAC;^ 
CTAOGAGAGA 
ATGAGCCCTG 
AAGACOCTGA 
GTGOGTCATC 
ATQACCCCTG 
AASGCTGTGG 
GGGCAGACTA 
ATGGGTAACIT 

gtcatccrgc 
ggaaaccatt 

GAAATCC3CAG 
AGGGAAAACC 
AAQCAGGCAT 



41 
I 

TQaTTCTTGT 
ACTGGGCACA 
TCrGCACCC3\. 
ATCTCCAGGA 
AGCTGAA2U^C 
TOTTQTDCSUd 
CACATGACTT 
TGCATGACCG 
AQGCACTGCT 
ACATPOTCCA 
TCITGTCTTA 
TCTCTCIGCC 
GGGCACCCTC 
TCAAftGOCTC 
ACACCAAGCA 
AGAAGCACIG 
GCTATEERCAA 
GOGGGGCCAT 

Gcccnrccc 

CTTCTTTACC 
GGCTGGCTGT 
CXGGGGTCAC 
CTGTGACCCC 
AOACXGAGAT 
GAGAQAAGGC 
CCTCCGXGG6 
AGTCTGTGAC 
TCCATTT T CA 
OOCGTOAAAA 
TGCJCTTTAaS 
TGG6TCXTCA 
TCCOGGAACA 
C5CTCTGTCAA 
AAAACTCTGC 
CXT6TCTTTT 
CAAAAOCAGA 



60 
120 
180 
240 
3 00 
360 
420 
480 
540 
600 
SGO 
720 
780 
640 
900 
960 
1020 
1O80 
1140 
1200 
12 60 
1320 
1380 
1440 
1500 
1560 
1620 
IfiBO 



SI 
I 

GCTGAAACSU: 
CAGTCGGCOV 
CCTGATATTT 
TGAGAAAATT 
ACTACTOTCC 
ATTTGCCAAG 
TGATGGTCTT 
GXGGACTTTT 
CACCaTGOOC 
AACATCCTAir 
TGACTTACAT 
XGAAGAOCSCX: 
AGAGAAGCTC 
TAAGAAIGGG 
AGAAGGCXTC 
GATTOATXAC 
TGCCATCAGC 
GOTOTGGACA 
CJCTTGTCTAC 
ACAATXTTGQ 
GACCAIC3GGCA 
TGAGATOCAC 
'CAC3UUU3GAA 
CACrCGGGCA 
CCTGACJCCCT 
TTATCAGTCr 
GOCTGTGAer 
GACTGAGACC 



AGOGGAQAAT 
GAlOCSAAGCf 
AACTOCTCTA 
CTCAGTAACC 
TGTGGATGAA 
CTTCTAAGTG 
ATAGQCCAAT 



€0 
120 
180 
24D 



60 
120 

lao 

240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 

load 

114 Q 
1200 
1260 
1320 
13B0 
1440 
1500 
15£0 
1620 
1680 
1740 
IBOO 
18S0 
192 D 
1980 
204O 
2100 
2160 


CTCTTTTCC& TTAAATAAAC TGTAAACACA AOIUVCCCA 

Seq ID NO: 632 Protein sequence
Protein Accession #: NP_002546.1



1          11         21         31         41         51

|          |          |          |          |          |

MWKi:.Ll.WVGL VLVIiKHHDGA AHKLVCYFTN MAHSRPGPAS ILPHDLDPFL CTHLIFAPAS 60 

MEMUQIVAKD LC13EKILYPE FWKIiKERIjIRB LKTMiSIGGW JIFGTSHFTTM I^STFANREKP 120 

lU lASVISbLRT HI>FIX3£DI>FF LYPGLKGSPM RDRHTFLFZiI EBIiIiFAFRKS AUjTMRPRIjIj 180 

LSAKVSGVFH XVQTSYDVRF liaRLLDFIKV LSynitHGSWB RPTGHNSPIiF SliPSDPKSSA 240 

YAMMYWRKLG APSEKilKGI PTYGKTFRLL KASKMOLQAR AIGPASPGKY TKOfiSFlAYF 300 

EICSFVHQRK KHWIDVQYVP YMIKaKBHVG YDNAISPSYK AWFlHREHFCS GAMVMTtDMD 360 

- _ XmBGTFOGRTG FFPLVYVUID lEiVKAEF&ST SliPQFnijSSA VWSS3TDPER LAVTrAWTTD 420 

ID SKII.PFGGE& GVTEIHGEDCB Si»riTPR6TT VTPTKBTVSL GKHTV3^EK TBITCAMTMT 48 □ 

SVGHQSMTPG EKAIiTFVGHQ SVTTGQICTLT &VO¥QSVTPQ EKTLTPVGHQ SVTPVGBC2SV 540 

SPQGTTMTFV HFQTETLROW TVAPRRKAVA REKVTVPSRN ISVTPBCSgTM PEilK3BI!ILTSB 600 

WTHFRMGUI. GLQHEASN5M MLSSSPVIQL PBQTPIiAFEfKT RFVPIYGMH3 SVNSV1:PC2IS 660 
PI*5liKKBIFB NSftVDBBA. 



Seq ID NO: 633 DNA sequence

Nucleic Acid Accession #: NM_003885.1

Coding sequence: 98..1021



1          11         21         31         41         51

|          |          |          |          |          |

AAACXCAGAA TTTTCX3CGG6 CTCGGlGAQC QQTTTTATCC CTCOOOCCX9G CAGGCTOGGC 60 

□GAOaOGGOT AGCCCC05CC CQGCaCQCAG CAGCAOCATG GGCAOSeTGC TGTOCCTGTC 120 

TCCCAGCTAC GGSanSQCCA CGCT6TTTGA GGATQQC3CQ GCCACCX5TQG GCCACT7VTAC I BO 

J\} GGCa3TACAO AACAGGAASA AOQOCAABGA CAAGAACCTG AASCX3CCACT CCATCATCTC 24.0 

CGTGCTGCCr "rGGAAOACSAA TCGTGGCOGT GTCQGCCRAQ AflGftWSAACT CCAAG2UU3GT 300 

GCBGCXITAAC AGOySCTACC AQAACAACAT CACGCAOCTC AACaATOAOA ACCTGAAGAA 360 

GTOGCTGXOS TOCGCCAACC TGTOCACATT COCCCRGCOC CCACOGSOCC AGCOQCCTOC 420 

AOCCCCC3QOC AGC3CaGCTCl CQQOTTCCCA GACOGGtSGGC TCCTCCTCns TCAAGAAAGC 480 

J J CCCTCACCCT GCC3GTCACKT CCGCAQGGAC GCCXaAAOQQ GTCATCGTCC MSGG&TCCAC 540 

CAGTOaOCTG CTTOQCTGCC TGOOTaAOTT TCTCTGCCGC CGGTGCTACC OCCimU3CA 600 

CCTGTCCXX:C ACQGACCOCG TGCTCTGGCT GCGCACSCQTG OAOCGCTOGC TGCCTCTQCSW 660 

GGGCTOOCaG GACCAGGGCT TCRTC3U33CC CSGtXaACGTG GTCXTCCTCT ACATCCTCTG 'J20 

C9GGG»ZGX!E ATCTCCaXSOG AGGTGGGCTC G(3ATCn.CGAG CTCCAGGCOG "TOClQCTtaac 780 

^Kf AIiGK3ClGTAC CTCTCCTACT CCTAlOirGGQ CAACX3AGATC TCCTACXZOOC TCAAGCCCTT 840 

OCTG6TGGM3 AGCTGCAAGG aOGCCTTTTG GGAjDCQTTGC CTCTCTGTCA TCAACCTCAT 900 

GaOCTCAAAG ATGGI6CAGA TAAATOCCBA COCRCACTAC TTCACACAiJG TCTTCTOOGa 960 

OCTG ABGAA C QAOAaCSQGCC AGSAGG&CAA GAABOQGCTC CTCCTAGGCC TGSATGSBT0 1030 

AGCACTOTftG CCTGOGTCAT GGCTCAAlSGA TTCAATGCAT TTTTAAlSAAT TTATTATChA 1080 

Seq ID NO: 634 Protein sequence
Protein Accession #: NP_003876.1

1          11         21         31         41         51

|          |          |          |          |          |

HGTVliSliSPS YKKATLFBDG AATV<mVTAV QNBKNAKDKH LKRESIISVI. FWICHZVAVSA 60 

KKKW SKOT QP HSfiYCSHHZTB LNfilEKIiKKSL SCAHLSTFAQ PPPAQPPAPP ASQL&GSQTO 120 

GSSBVKKASa FAVTSAGTPK ILVIVQASTSB ZrltSOiGEFIiC RRCSfRLKHLS PTDPVI*m.RS 180 

DZ> ^maSXiCiXQGH QDQGPITPBM WFLYMIiCBD VISS&VIGSDH ELQAV£E>TCL> 7LSY8YMGN6 240 

XSYPLKPEliV SSCKBAFNDR GLSVUIZMSa KHLQnaADPH yPTQVPSIlIiK HESaQBDKKR 300 
TABLE 79A:



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Seq ID No: Sequence identification number linking information in Table 79A to sequences in Table 80
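
The column key above can be read as a record layout. The following is a minimal sketch, assuming Python; the names `TableRow` and `parse_seq_ids` are hypothetical and not part of the source, and the Seq ID cell format ("Seq ID No. C1 & C217", or a single identifier such as "Seq ID No. C12") is taken from the entries listed further down in this table.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TableRow:
    """One row of the table, following the column key (hypothetical layout)."""
    pkey: str                   # Unique Eos probeset identifier number
    ex_accn: Optional[str]      # Exemplar Genbank accession number, if listed
    unigene_id: Optional[str]   # Unigene number, if listed
    unigene_title: str          # Unigene gene title (may be truncated in the table)
    seq_ids: Tuple[str, ...]    # Seq ID No entries linking to the sequence table


def parse_seq_ids(cell: str) -> Tuple[str, ...]:
    """Extract identifiers such as 'C1' and 'C217' from a Seq ID No cell."""
    # An identifier is assumed to be the letter C followed by digits.
    return tuple(re.findall(r"C\d+", cell))
```

For a cell that reads "Seq ID No. C1 & C217", `parse_seq_ids` yields the two identifiers as separate strings; a cell with a single identifier yields a one-element tuple.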
418007 


Ml 3909 




41873B 


AW3B8633 


Hb.6682 


443&4S 






4099^ 


AW103364 


H5.727 


422BS7 


will Of 


116. 1004 


4443&1 






421S&2 


AI9tQ275 


nSitioiMfii 


411769 




lie niKj 


452281 


T93SD0 


Ufl 911709 


428690 


AA852773 




421552 






42S247 






432201 




Hs 90(1941 


447377 


X77343 




446921 


/\DU l£l Id- 


na> jo^iW 


4168B9 




Me nfldjlfi 


432179 


X7580S 


flGu&if 10 
Urn ICM 


409899 
411975 


niaiowo 
U7762ISI 




447400 




lie lAK? 


449032 




U« 99qnn 


415214 




ns>i ^Diz*} 


443247 






422043 




ns^oo 1*0 


4in£iA 


I/O loiU 


Ua l%qq'K 


446342 






411274 


MM n0977R 




104976 






i^2260 




Hs.1054S4 


409041 




rn.DUUul 


420844 




)-la tY7lni 

ns^r lUl 


4d?SR:i 
437935 




tie DQAei 


422330 




ns. 1 twM9 


40B90B 






407811 


AWkfiqnpn? 

Mi* iau9U^ 




437952 






400243 


Y007&7 






nr Uggwr 


ns.iQOisv 
417771 


AA80469B 


HaJ32S47 


421379 


Y1S221 


Hs.103982 


442006 


AW97Sie3 




413048 


M9a221 


K&.75lfi2 


443324 


R44013 


Hs.164225 


424917 


A]6a620B 


H6.segoi 


424917 


A163620B 


HaJ9B901 


444SZ7 


NM.00540B 


Hs.1ia83 


442652 


A1Q05163 


Hs^1378 


450726 


AW20460O 


Hs^55462 


41G965 


N26223 


HS.18043S 


442275 


Am49467 


Hs.54795 


431745 


AWg7244S 


HS.16342S 


431745 


AW972448 


H8.16342S 


453142 


AAG33S4a 


Hs7473 


4216S9 


T4M.014459 


Hs.106511 


444090 


SG9115 


Hb.10306 


«1563 


NM.006433 


HS.10SB06 


430413 


AW342182 


Hs.241392 


414091 


C1769B 




41SS33 


AA251131 


H5^20697 


424943 


AU0772eO 


MB.1S3g24 
spondin 2, extracellular matrix protein
transmembrane protease, serine 4
solute carrier family 7 (cationic amino
kallikrein 10
Homo sapiens, Similar to RIKEN cDNA 2010
regenerating gene type IV
Hypothetical protein, XP.OSISBO (KIAA1 10
putative G protein-coupled receptor
prominin (mouse)-like 1
mucin 13, epithelial transmembrane
epiregulin
serine/threonine kinase 15
cysteine knot superfamily 1, BMP antagon
putative GPCR
ntaibii^ 8
ATPase, Class I, type 8B, member 1
sema domain, immunoglobulin domain (Ig),
ubiquitin carrier protein E2-C
ERO1 (S. cerevisiae)-like
retinoic acid receptor responder (tazaro
small inducible cytokine subfamily B (Cy
ESTs, Weakly similar to S72482 hypotheti
mannose receptor, C type 1
ESTs
hypothetical protein FLJ23049
hypothetical protein FLJ23049
small inducible cytokine subfamily A (Cy
Homo sapiens cDNA FLJ40427 fis
HUMPSPBA Human pulmonary surfactant-asso
MDAC1
Homo sapiens secretoglobin, family 3A, m
Novel FGENESH predicted cadherin repeat
Novel FGENESH predicted cadherin repeat
Homo sapiens gap junction protein, alpha
protocadherin 17
natural killer cell group 7 sequence
granulysin
small inducible cytokine A5 (RANTES)
Homo sapiens upregulated by BCG-CWS (UQ
Homo sapiens tryptophanyl-tRNA synthetas
death-associated protein kinase 1



Seq ID No.
Seq ID No. C1 & C217
Seq ID No. C2 & C218
Seq ID No. C3 & C219
Seq ID No. C4 & C220
Seq ID No. C5 & C221
Seq ID No. C6 & C222
Seq ID No. C7 & C223
Seq ID No. C8 & C224
Seq ID No. C9 & C225
Seq ID No. C10 & C226
Seq ID No. C11 & C227
Seq ID No. C12
Seq ID No. C13 & C228
Seq ID No. C14 & C229
Seq ID No. C15 & C230
Seq ID No. C16 & C231
Seq ID No. C17 & C232
Seq ID No. C18 & C233
Seq ID No. C19 & C234
Seq ID No. C20 & C235
Seq ID No. C21 & C236
Seq ID No. C22 & C237
Seq ID No. C23 & C238
Seq ID No. C24 & C239
Seq ID No. C25 & C240
Seq ID No. C26 & C241
Seq ID No. C27 & C242
Seq ID No. C28 & C243
Seq ID No. C29 & C244
Seq ID No. C30 & C245
Seq ID No. C31 & C246
Seq ID No. C32 & C247
Seq ID No. C33 & C248
Seq ID No. C34 & C249
Seq ID No. C35 & C250
Seq ID No. C36 & C251
Seq ID No. C37 & C252
Seq ID No. C38 & C253
Seq ID No. C39 & C254
Seq ID No. C40 & C255
Seq ID No. C41 & C256
Seq ID No. C42 & C257
Seq ID No. C43 & C258
Seq ID No. C44 & C259
Seq ID No. C45 & C260
Seq ID No. C46 & C261
Seq ID No. C47 & C262
Seq ID No. C48 & C263
Seq ID No. C49 & C264
Seq ID No. C50 & C265
Seq ID No. C51 & C266
Seq ID No. C52 & C267
Seq ID No. C53 & C268
Seq ID No. C54 & C269
Seq ID No. C55 & C270
Seq ID No. C56 & C271
Seq ID No. C57 & C272
Seq ID No. C58 & C273
Seq ID No. C59 & C274
Seq ID No. C60 & C275
Seq ID No. C61 & C276
Seq ID No. C62 & C277
Seq ID No. C63 & C278
Seq ID No. C64 & C279
Seq ID No. C65 & C280
Seq ID No. C66 & C281
Seq ID No. C67 & C282
Seq ID No. C68 & C283
Seq ID No. C69 & C284
Seq ID No. C70 & C285
Seq ID No. C71 & C286
Seq ID No. C72 & C287
H8./Dt»r 


415817 


UBB967 


Lht 7fiO&T 


4Z1B17 
rS.lUoooU 


4<t3olf 


Arl4w/4 




409420 


Z15008 


Hs.644S1 


44lAD'a 




ns./o^i 




AVYDowef 
rlS.4fC0u 




russbu 


iJik 4Qft/a7 


414/ f*l 


AUy£41V 


H8.77274 


41009I 






453857 


At nDfl^!lJC 




4491 Ql 


AA2ljao4f 


tin OOftlC 


AtonB.'i 
4Z9Z03 


ivwiyuU4 


Li_ i-omne 


4)dl4r4 




nS.1U4oof 


4iilf 30 


bE:J14U/o 


nS.1U79ll 




Mti Mne7c 


Lin JC7JO 


426761 


A101S7Q9 


HS,172QB9 


42973B 




Hfi^2680 


4309B5 


Aft4SOZ3Z 


HS.273Z3 


431o9U 


X17033 


rfeZ71Vo6 


432583 


AV«j23624 


n&.t622B2 


446B72 


X97D68 


Hs.16382 


4wnuz 




n8«9lQQ4 






rlSuio4oiJr 


4«JUZoU 


A/V£}1^90 


t t- 'M7DCQ 


428486 


AW963497 


rl8.l64QU4 


4o/<WBf 


AUUMMc 
MBwiBlS 


U» 


432of4 


VUDAVn 


nS.Z#909i 


aaksuxa 
44qwl 


AUimM9J9 

AW9a1d4Z 


n9Ll984aD 


440ni 




flS.1WtQU 


4W0oZ 






*M&4f 




Hu937ti 








4U4ZB7 






40429/ 






41CKIIO 


U4f/il2 


nS.a4U72 


444754 


TB3911 


K&11861 


432596 


^1224741 


Hs.27o461 


444006 


BE395UB5 


H3.334/62 


428505 




Vie 99ft1 


448844 


Al5ai519 


H5.177164 


44B844 


AI581519 


H3-177164 


42B392 


H10233 


Hs.2265 


448030 


N3D7t4 


HS.3259G0 


4221(9 


873265 


H8.U73 


448048 


Z45051 


H^^20 


417931 


W95&42 


Hs^61 


41921B 


AU076718 


Hs.164021 


426227 


U67(SB 


Hs,1542d9 


413554 


AA319146 


HSJ6426 


445417 


AKD01058 


Hs.12fiB0 


426322 


J0SDG8 


HB^iOia 



glypican 1
tumor necrosis factor, alpha-induced pro
v-erb-b2 avian erythroblastic leukemia v
carbonic anhydrase IX
solute carrier family 16 (monocarboxylic
transforming growth factor, beta-induced
chloride channel, calcium activated, fam
matrix metalloproteinase 12 (macrophage
matrix metalloproteinase 12 (macrophage
uroplakin 1B
cadherin 3, type 1, P-cadherin (placenta
solute carrier family 6 (neurotransmitte
solute carrier family 2 (facilitated glu
solute carrier family 7 (cationic amino
gap junction protein, beta 5 (connexin 3
heparin-binding growth factor binding pr
FLJ20522 Hypothetical protein FLJ20522
matrix metalloproteinase 9 (gelatinase B
integrin, beta 4
G protein-coupled receptor 87
immunoglobulin superfamily, member 9
hyaluronan synthase 3
glycoprotein (transmembrane) nmb
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
protein tyrosine phosphatase, receptor-t
ATP-binding cassette, sub-family C (CFTR
ATP-binding cassette, sub-family C (CFTR
laminin, gamma 2 (nicein (100kD), kalini
claudin 1
neurotrophic tyrosine kinase, receptor,
neurotrophic tyrosine kinase, receptor,
neurotrophic tyrosine kinase, receptor,
hypothetical protein XP_098151 (leucine-
plasminogen activator, urokinase
ATPase, Class VI, type 11B
Ras-induced senescence 1 (RIS1)
G protein-coupled receptor
ATP-binding cassette, sub-family A (ABC1
solute carrier family 1 (glutamate trans
ATP-binding cassette, sub-family B (MDR/
adenosine A2b receptor
E>ORIMIK pR>«(»06l6 ncepior laducing me
tumor necrosis factor receptor superfami
ESTs, Weakly similar to I78885 serine/th
integrin, alpha 2 (CD49B, alpha 2 subuni
potassium channel TASK-^ potassium chan
pyrimidinergic receptor P2Y, G-protein c
frizzled (Drosophila) homolog 10
plexin C1
interleukin 7 receptor
pancreatic polypeptide
caypBcoene
melanoma inhibitory activity
DPCR1 protein
DPCR1 protein
ortholog of mouse polydomain protein
FGENESH predicted novel secreted protei
FGENESH predicted novel CUB-domain conta
FGENESH predicted novel CUB-domain conta
FGENESH predicted novel CUB-domain conta
transmembrane 4 superfamily member 3
transmembrane 4 superfamily member 4
melriEnS
type I transmembrane protein Fn14
chromogranin B (secretogranin 1)
FGENESH predicted novel cell surface pr
FGENESH predicted novel cell surface pr
secretory granule, neuroendocrine protei
membrane-spanning 4-domains, subfamily A
gastrin-releasing peptide
similar to S68401 (cattle) glucose induc
trefoil factor 3 (intestinal)
small inducible cytokine subfamily B (Cy
Human proteinase activated receptor-2 mR
secretogranin II (chromogranin C)
a disintegrin-like and metalloprotease w
transcobalamin I (vitamin B12 binding pr



Seq ID No. C73 & C288
Seq ID No. C74 & C289
Seq ID No. C75 & C290
Seq ID No. C76 & C291
Seq ID No. C77 & C292
Seq ID No. C78 & C293
Seq ID No. C79 & C294
Seq ID No. C80 & C295
Seq ID No. C81 & C296
Seq ID No. C82 & C297
Seq ID No. C83 & C298
Seq ID No. C84 & C299
Seq ID No. C85 & C300
Seq ID No. C86 & C301
Seq ID No. C87 & C302
Seq ID No. C88 & C303
Seq ID No. C89 & C304
Seq ID No. C90 & C305
Seq ID No. C91 & C306
Seq ID No. C92 & C307
Seq ID No. C93 & C308
Seq ID No. C94 & C309
Seq ID No. C95 & C310
Seq ID No. C96 & C311
Seq ID No. C97 & C312
Seq ID No. C98 & C313
Seq ID No. C99 & C314
Seq ID No. C100 & C315
Seq ID No. C101 & C316
Seq ID No. C102 & C317
Seq ID No. C103 & C318
Seq ID No. C104 & C319
Seq ID No. C105 & C320
Seq ID No. C106 & C321
Seq ID No. C107 & C322
Seq ID No. C108 & C323
Seq ID No. C109 & C324
Seq ID No. C110 & C325
Seq ID No. C111 & C326
Seq ID No. C112 & C327
Seq ID No. C113 & C328
Seq ID No. C114 & C329
Seq ID No. C115 & C330
Seq ID No. C116 & C331
Seq ID No. C117 & C332
Seq ID No. C118 & C333
Seq ID No. C119 & C334
Seq ID No. C120 & C335
Seq ID No. C121 & C336
Seq ID No. C122 & C337
Seq ID No. C123 & C338
Seq ID No. C124 & C339
Seq ID No. C125 & C340
Seq ID No. C126 & C341
Seq ID No. C127 & C342
Seq ID No. C128 & C343
Seq ID No. C129 & C344
Seq ID No. C130 & C345
Seq ID No. C131 & C346
Seq ID No. C132 & C347
Seq ID No. C133 & C348
Seq ID No. C134 & C349
Seq ID No. C135 & C350
Seq ID No. C136 & C351
Seq ID No. C137 & C352
Seq ID No. C138 & C353
Seq ID No. C139 & C354
Seq ID No. C140 & C355
Seq ID No. C141 & C356
Seq ID No. C142 & C357
Seq ID No. C143 & C358
Seq ID No. C144 & C359
Seq ID No. C145 & C360
Seq ID No. C146 & C361
Seq ID No. C147 & C362
Seq ID No. C148 & C363
Seq ID No. C149 & C364
Seq ID No. C150 & C365
Seq ID No. C151 & C366
Seq ID No. C152 & C367
Seq ID No. C153 & C368
Seq ID No. C154 & C369
Seq ID No. C155 & C370
413719 
431462 
410498 
413095 
426125 
43B729 
437145 
451820 
427557 



BE4395BO 

AW5B3672 

U33632 

AA4943fi9 

)ffl7241 



421340 
428187 
428187 
422278 
446618 
419452 
428242 
439659 
411825 
412314 
429150 
419073 
411B2B 
419508 
421779 



436872 
410288 
416370 
437062 
421481 
444151 
426174 
410037 
425071 
421829 
418578
419883 
419693 
448988 
446988 
448988 
448968 
430144 
408833 
482017 
415092 



443891 
425976
432800 



AFOmiB 

AW05B357 

NMLD02659 

AIJ033377 

F077B3 

AI6B7d03 

Ar667303 

AF072873 

AU076643 



H55709 

AW9707flQ 

AKDQ0334 

AA825247 

An2{l103 

AW372170 

AW161449 

AW9g753B 

AI87S159 

Ar091277 

AFOlTSae 

AA2a4879 

AA316181 

N90i70 

AAB6ie97 

AW391972 

AW972917 

AA547959 

AB020725 

NhLD13g89 

AB01B330 

AWgG8l59 

AA133749 

AA133749 

709783 

Y09763 

Y087B3 

Y€d763 

AI732722 

AW612232 

AF109302 

005837 

005837 

N^002250 

C750d4 

BE39104S 

AW380ZBZ 



426263 
421537 
43^3 
427715 
413049 
414555 
422424 
432378 
409041 

TABLET9B 



NM.001197 
BE3834B0 
HMJD04445 
BE245Z74 
NliUI02151 
N98569 
AI1 66431 
AMa304G 
AB093Q25 



Hs.754g8 

H3.256311 

Hs.79351 

Hs.30715 

H8.166994 

H8.351316 

Hsj5462 

Hs.1 99248 

MS.179S57 

H5v44197 

Hs.1369 

H&285S29 

Hs.285529 

H8.1 14218 

H3.313 

K5.90572 

HS.22SD 

H8.59483 

Hs.352415 

Hs.35eoa4 

H6,19736e 

Hs.1 83918 

HSJ2290 

Ns.90786 

Hs.l0e2l9 

HS.3Q2834 

Hte.31386 

K3.2S640 

Hs.6ie35 

H5.203697 

Hs.120591 

Hs.104698 

Hb.128749 

HS.11SB38 

Na^8009 

Hs,164424 

H8.108708 

H5,302740 

Hs.301350 

h3s.301350 

H3,22785 

HS.2Z785 

H5^27aS 

Hs^2785 

HBJ89Z7 

HS.25483S 

H827495 

H8.145007 

>fe.145B07 

Ks.10002 

Hs.334514 

Hs^89G2 

Hs^iiao 

Hs.145416 

H&.l554lg 

H&.1DS547 

Hsl3796 

HI5.18042B 

Hb.823 

Ha.76422 

HS.29S638 

H3,148133 



small inducible cytokine subfamily A (Cy
sr^nb^to neuroendoctfns pspltde pceco 
potassium channel, subfamily K, member 1
potassium voltage-gated channel, Isk-rel
FAT tumor suppressor (Drosophila) homolo
transmembrane 4 superfamily member 1
solute carrier family 4, sodium bicarbon
ESTs

plasminogen activator, urokinase recepto
hypothetical protein DKFZp584DD482
decay accelerating factor for complement
G protein-coupled receptor 49
G protein-coupled receptor 49
frizzled (Drosophila) homolog 8
secreted phosphoprotein 1 (osteopontin,
PTK7 protein tyrosine kinase 7
leukemia inhibitory factor (cholinergic
leucine-rich repeat-containing G protein
solute carrier family 39 (zinc transport
G protein-coupled receptor 27 (GPR27) {S
smoothened (Drosophila) homolog
transmembrane receptor Unc5H2 mRNA
wingless-type MMTV integration site fami
ATP-binding cassette, sub-family C (CFTR
wingless-type MMTV integration site fami
frizzled (Drosophila) homolog 8
secreted frizzled-related protein 2 (sir
daudinS 

six transmembrane epithelial antigen of

CI»6 antigen (p45) 

ESTs 

KIAA1324 protein
alpha-methylacyl-CoA racemase
Homo sapiens similar to Echinoidin (LOC1
KIAA0018 protein
deiodinase, iodothyronine, type II
calcium/calmodulin-dependent protein kin
Epilttallalcdicluiii channel;^ CaT-Tiie A 
FXYD domain-containing ion transport reg
FXYD domain-containing ion transport reg
gamma-aminobutyric acid (GABA) A recepto
gamma-aminobutyric acid (GABA) A recepto
gamma-aminobutyric acid (GABA) A recepto
gamma-aminobutyric acid (GABA) A recepto
ERGL protein; ERGIC-53-like protein
ESTs

prostate cancer associated protein 7
hypothetical protein FUldSOa
hypothetical protein FLJ13593
potassium intermediate/small conductance
KG22 protein 
AIM-1 protein



endoglycan
BCL2-interacting killer (apoptosis-induc
neural proliferation, differentiation an
EphB6 

KIAA1181 protein

hepsin (transmembrane protease, serine 1
phospholipase A2, group IIA (platelets,
prostate differentiation factor
ESTs

Hypothetical prol9lh.}(PjD5186fl (KIAAIId 



Seq ID No. C156 & C371
Seq ID No. C157 & C372
Seq ID No. C158 & C373
Seq ID No. C159 & C374
Seq ID No. C160 & C375
Seq ID No. C161 & C376
Seq ID No. C162 & C377
Seq ID No. C163 & C378
Seq ID No. C164 & C379
Seq ID No. C165 & C380
Seq ID No. C166 & C381
Seq ID No. C167 & C382
Seq ID No. C168 & C383
Seq ID No. C169 & C384
Seq ID No. C170 & C385
Seq ID No. C171 & C386
Seq ID No. C172 & C387
Seq ID No. C173 & C388
Seq ID No. C174 & C389
Seq ID No. C175 & C390
Seq ID No. C176 & C391
Seq ID No. C177 & C392
Seq ID No. C178 & C393
Seq ID No. C179 & C394
Seq ID No. C180 & C395
Seq ID No. C181 & C396
Seq ID No. C182 & C397
Seq ID No. C183 & C398
Seq ID No. C184 & C399
Seq ID No. C185 & C400
Seq ID No. C186 & C401
Seq ID No. C187 & C402
Seq ID No. C188 & C403
Seq ID No. C189 & C404
Seq ID No. C190 & C405
Seq ID No. C191 & C406
Seq ID No. C192 & C407
Seq ID No. C193 & C408
Seq ID No. C194 & C409
Seq ID No. C195 & C410
Seq ID No. C196 & C411
Seq ID No. C197 & C412
Seq ID No. C198 & C413
Seq ID No. C199 & C414
Seq ID No. C200 & C415
Seq ID No. C201 & C416
Seq ID No. C202 & C417
Seq ID No. C203 & C418
Seq ID NO: C1 DNA Sequence
Nucleic Acid Accession #: NM_005814
Coding sequence: 345..1304

1 11 21 31 41 51

| | | | | |

CTACCCCTTT GTGAGCAGTC TAQQACTTTG TAC3VCCT(3TT AAGTAOGGAG AAGGCAG6GG 60

AGGTOGCXGG TITAAGGCGA ACTTGAGGGA AGTAOGOAAG ACTCCTCTTQ GSAOCTTTGG 120

AOXAGQTGAC ACATGAGCCC AfiGCCCAGCT CACCZGCC3U^ TOCAiSCTGAiB OAGCTCACCr 180

GCCAATCCAiS CTGAQGCTGG CSCRQAGCfCQG GrrQAOAAGIUS GGAAAATTGC AOGGACCTCC 240 

AGTTOGGOCA GGCCAOAAGC TGCTGTAGCT TTAACCAGAC AGCTCAGACC TOTCTGGAGG 300 

CTGCcaOTOA CAGGTTAQGT TTiWSGGCRQA QAAGAAGCAA GACCATGGT6 G0GAA£5ATeT 360

GGCCTGTGTT GfTGGACACTC TOTGCftGTCA GGGT&JlXGi: CGATGOCRTC TCTGTGGAAA 420

CTCGGC&GOA. OE3TTCTTGG6 GCTTCGCMSG GAAAOAl^rGT CAOOCTGOCC TGCACCTACC 480 

ACftCTTCCRC CTCCAOTCGA QR0GGACTTA TTCRKEGGGA TAAGCXCCTC CTCRCTCWTA 540 

COSAAAdOGQT GGTCATCTGG CCSGTTTTCaA ACAAAAACTA CATCCATCGT GAGCTTTATA 600

AGAATOGOGT CAGCATATCC AACAATGCTG AGCAGrCX:GA OtSCCTCCATC ACCRTTCATC 660

AGCTOACCAT GGCTGACAAC GGCACCTACQ AGrrdTTCTOT CTCGCTGAIG TC2U3ACCraG 720

AOQSCAACAC CMGrCAOBT QTCCGCCTGT TGGTGCTGGT GCChCX2CTCC AAACCACMT 780 

GCGGCATGSA tSGQAGAGACC ATAATXGIGGA AiCAAiCATCCA GCTGAGCTGC CAATCAAACSG 840

AOGigCTCAOC AACCGCTCAiS TACAOCTGGA AGAGGTACAA CATCCTGAAT CAGQAGCAQC 900 

CCCrfiQC3CCA CCX^GOCTCA. GGTCAGCCTG TCTCCX3QAA OAATATCTCC ACAGACRC3LT 960

CGGGTTACTA CATCIGTACC TCCAOCAATG AGGA0GGGAC GCAGXTCliGC AAC3VTCRCGG 1020

TGGCCaTCAG ATCTCCCTCC AIGAAOSTGG COCZG7ATGT GGGCATCXiOG GTGGGOSa:£» 1080

TTGCaCSCGCT CATTATCATT GQCATCATCA TCIACTOCTG CTGCTGCCGA OOGAAGGACQ 1140 

ACAACACTGA AjOACAAGQAG GATGCAAGGC CGAACCGGGA AGOCTATWO GAGCCAOCM 1200 

2V3CAGCTAAG AGAACTTTCC AGAGAQAQOa AGQAOOAGQA TGACTACRdGG CAAGAAQAQC 1260 

AGAGGftCCAC TC3CiSCSTGAA TCCCC30GAOC ACCTCSACCA GTGACftQGOC AQCAGCAGAG 1320

GGOaOCTCSAG GAAGGGTTAB OGGTTCATTC TGCCQCrrcC TGOOCTCOCT TCTOCnTCT 1380

AAGOCCIGTT CrCCTGTCC3C TCCATCCCAG ACATTGKTGG GGACATTTCT TCTOCRGTGT 1440 

CAGCTGTGGO GAACATGGCT GGCCTGGTAA GOOGOTCCCT C3TBCTGATCC TGCTGACCTC 1500 

ACTGTCKTGT GAfliSTAACCC CTCCTOOCTG TGACACCTGG TGCXSGGCCXG QCCCTCACTC 1560

AA(2ULCCAaaC TGCAGCSCTCC ACTTCCCTCG TAGTTGQCAQ GAQCTCCTGG AAGCACAGOG 1620

CIGAGChraS OQGQCrCSCCA CrCAGAACTC TOC3K3GGAG6 CXSATSCCAGC CTTOGGGOGT 1680 

GGGCaaCTGTC CTGCTCACCT GIGTGOCCAfi CACCTGQAGO GGCACCAiSGT GGAOQGTTia 1740 

CACTCCftCAC ATCTTTCPTG AATGAATGAA AGAATAAGTG AGTAaOCTTQ GGCCCTGCaT 1800 

IXSGCCEXSGCC TCCAGCTCCX: ACTCXZCTTTC CAACCTCACT TCCCX3TAGCT GCCAGTATGT 1860

TCCRAACCCX CCTaQOAAGG CXaCCTCOCA CTOCTGCIGC ACROGCCICTO QGGAGCTTTT 1920

GCCCACACAC TTTOCATCIC TGCCrGPTCAA TATCQTACXn* GTOCCTOCAG GCOCATCTCA 1980

WCTCftCAAGG ATTTCTCTAA CCCTATCCTA ATT6T0CACA TAOnQQAAA CAATCCTGTT 2040 

ACTCTQTCCC ACGTCCAATO AtGGGCC3U3l AGGCACAGTC TTCTGASaSA GTOCTCTCAC 2100 

TGTATTAGAG CGCCA(3CrOC TTQGGGCAfSG GCCIGQGCCX CAIQGCTTTT GCTTTCCCTG 2160

AAGCCCIAGT A6CTGGOGCC CATCCIACTO GGCACTTAA6 CTTAAT3GGG GAAACTGCTT 2220

TGATTGGTTG TGCXnTOCCT TCTCTGGTCT CCTTGAGAT6 ATQGTAOACA CAGGGATOAT 2280

TCCCAJCCCAA ACCCMOSaa TCATlCnGTB AGTTAAACAC GAATTGRTTT AAiUSlGAACA 2340 

CACRCAAGGG ASCTTGCTTG C3«3ATGGTCr GAGTTCTTGT GTC(?IGGTAA TOCCTCTCCA 2400 

GGCCAOAATA ATTGGCA7GT CTGCTCAACC CACATQGGGT TCCTGGTTGT TCCTGCATCC 2460

CGATACCTCA GCCCTGGCOC TGC30C3iGCCC ATTTGGGCrC TQSTTTTCTG CTGOGGCTCT 2520

GCTGCTGCOC TCOCAC3U3GC TCCXTCIQTT IGTCGAGCAT TTCTTCTACT CTTGAOAflCT 2580 

CAGGCAGOGT TAGGQCTQCT TAGGTCTCAT GGACCA6TGG CrGSlCTCAC CCAACTGCAG 2640 

TTTACrATTQ CTATCTTTTC TGGATGATCA OAAAAATftAT TOCIVTAAA3C TAtTGTCTAC 2700 

TTGOGATTTX TXAAAAAATO TATATTTTTA TATATATTGT TAAJU^CCTTT GCTTCATTCC 2760

AAArrGCTTTC AGTAATAATA AAAarrGTXSGG TGG 2793






Seq ID NO: C2 DNA Sequence
Nucleic Acid Accession #: E
Coding sequence: 1..3150

1 11 21 31 41 51

| | | | | |



ATGGGGAGCC GGACGCCAjaA GTCOOCTCTC CAaSCCGTGC AGCUGOGCTO <3QC3CCOOOOG 60 

^- OGCOGACOCC CGCrSSTGCX: 6CTGCTGTTG CTOCTSSTGC 03CX?£30CAOC CAGGGTCGGG 120 

O J GGCTTCAACr TAGACGGGGA GGCCCCAGCA. QTACTCIGG6 GGOCCCaSGG CTOCTTCTTC 180 

GGArrXCTCAG TGGAQTTTTA GOSGCC35GGA. ACAGAGGGGQ TCAGTQTGCT GGTGG6AGCA 240 

CCCAAGGC7A ATACCftGOQA GCXAfiBAGTG CXGCAGGGTG GTGCtGTCTA CCFCXfJTCCT 300 

TOGGGTGCCA GCCX!CAGACA GTGGACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CT GGAS TCCT CACTG TCC flG CTCSUaAGOGA GACSSAGCCTG TOQAGIACAA OTOCTTGCAG 420 

/U TGGTTOGGGG CAACAaTTOS AGCCCS^GGC TCCTOCATCr TQGCATG06C TCCACXGTAC: 4SQ 

AGCTQGCG^ CAGAGAAQGA QCCACTQAGC OACCOOGTGG GCAOCTGCTA OCTCTCCACA 540 

GATAACTTCA CCC3GAATTCr GGAGIAT6CA COCXGCOQCT (aaATTTCIUS CTGGGCJUSCA 600 

GGACAGOGTT ACTGCXMCQ AGGCTTCAGT GCQGAGrfCA CCAAflACTGG CCGTGTGGTT 660 

_ _ TTAGGTGGAC CftGGAAaCTA T^CTCTGGCaA GGCCAGATCC TGTCTGCCAC TCAGGAGCAO 720 

/D ATTGCAGAAT CTTATTACCC OGAGTAOCTG ATCAACCIG6 TXCAfiOOGCA GCTGCAGACT 780 

CGOCAGGCCA GTTCXATCTA TGATGACAOC TACCTAGQAT ACT CT GTGB C TGrTTGGTGAA 640 

TTCAGTGQTG ATGACACAGA AGACTTTGIT GCTGGTGTGC CCAAAfSGGAA CCTCACTTAC 900 

fSGCTATGTCA CCaVTCX^TTAA IGGCTCAGAC ATTOaATOCC TCTACAACTT CTCAGGGGAA 960 

Ofk CAGAnCGGCCT CCTACTTTQQ CTATQCaWSTG GCOGCCACAG AOGTCAATGO GGAaaCGCTG 1020 

OV 6ATGACTTGC TGGTQQGGOC AOCCCTGCXC ATSGATCGGA CCCCTGACGG GCXSGCCfCAG 10 BO 

GAGGTGGGCA CKSGTCTAOGl" CTACCTGCAG dVOOOUSGOG GCAiSAGUbGCC CAOGCCCAGC 1X40 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CX3ATTTQGCA GCTCCTTGAC OCCCCTGGGG 1200 

GACCTQGACC AQ QATGGC TA CAATGAIGTG GCCAXaaOGB CXCCCTTTGG TGGGGAGACC 1260 

CAGC3U3GGAG TAGTSTTTGT ATTTCC3GGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 
CAGOTTCTGC AGCCXCTGTG G)GC3U3CCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380

CGAOGASGCX: GAGACCTOGA TGGCAATG6A TATCCTGATC IGATT6T6G6 GTCCTTTGGT 1440 

GTGGACSUIGG CTGTGGTATA CAGGGGCOGC CCJCATCQTOT CCGCTAGTGC CTCCCTCACC 15 DO 

ATCTTC3CCOG CCATGTTC3^ CCOVSAGGAG OGGAGCTGCA GCTTAGAGQG GAACCCTOTO 1560 

J GCXTTGCATCA ACCTTA3CTT CTCSCCTCaUVT QCTTCTGOAA AACAOGTTGC TGACTGCATT 1620 

GGTTTCACAG TGQAACTXCA GCTGGACIGG CAGRRGCAGA AQQGAjQGOTT ACGGOGGOCA 1680 

CTGTTCCTGG CCTCCAOQCA GGCAACCCTO ACCX:nGA<XC TGCTCATCCA BAATGGGGCT 1740 

GGAGA6GATT GCAGASajGAT QAAGATCTAC CTCAGGAAC3G AQTCBGAATT TOSAGACAAA 1800 

CTCTCGCOGA TTCACATCGC TCTCAACTTC TCCTTGGACC OCSCAAGCCCC AGTGGACAGC 1B60 

lU CA OGGC CTCA OGCCAGCCCT ACATTATCAG AGCAAGAGCC GQATAGAGGA CAAQGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TCT6TGCCT6 AOCXQCABCT GGAACrTGTTT 1980 

GGGGAGCAGA AOCATGTGTA CCTOQGTQAC AAGAATOOCK TGAACCTCAC TTTCCATGCC 2040 

CAGAAT6TQG GTaAGGGTGG CGCCTATGA6 GCTGAGCTTC GGGTCACOGC OOCICCAGAG 2100 

^ _ GCTGAGTACT CAGGACTCGT CZAGAC31CCCA QGGAACTTCT CCAGCCTGAG CTOTGACTAC 2160 

13 TTTGCOGTGA ACCAiSflGCCG OCTGCTGGTG TGn^CTGG GCAACCCCAT GAAG6CAGGA 2220 

GCCAflgTGT GGGGTGGCCT TC60TTTACA QTCCCTCATC ^TOCGGGACAC XAAOAAAACC 2280 

ATC CT CrrTTG ACTTCCAOAT OCTCAGCAAG AATCTCAACA ACTCOCAAAG CGAC3GTGGTT 2540 

TCCTTTCQGC TCTCCGTGGA GGCTCAGGCZC CAOQTCACCC TGAAOGGTGT CTCCAAGCCT 2400 

GAQGCAGTGC TATTCCCAGT AAGOGACTGG CATCX3CXX3/US ACCAQCCTCA GAAGGAGGAfi 24fi0 

JXl GACCrOGQAC CTGCTGTCCa C!CATQTCrAT GAGCTCATCA ACCAAGQCCC GAQCTCCATT 2520 

AQCCAGGGXG TGCIGGAACT GAGCTGTCGC CAOGCIXlTGG AAGGTCAGak GCTOCTATAT 2580 

GTGACCA0AQ TTAOSOGACT CAACTOCACC ACXAATCACC CCATTAACCC AAAGGQCCTG 2640 

OAGTTGGATC CaSAGOGTTC CCTGCAOCAC CAGCAAAAAC GGGAAGCTCC AAGCJCGCAGC 2700 

TCTGCT3'CCrr CGGGACCTCA GATCJCTQ/UUl TQCCCGQAGQ CTGAGTGTTT CaGGCTGpac: 3760 

TGTGAGCTOG GGCCCCTQCA CCAACAAGAG AGCCAAAGTC TGCAGTTC3CA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGASCAC CAGCCATTTA GCCTGCAGTG rCGAGGClCTQ 2880 

TACAAAGCCC TGAAGATGOC CTACQQAArC CTGCCTOGGC AGCTSOCC3C3V AAAAGAGCGT 2940 

CAGGTGGCXa, CAOCTGTGCA ATGGACCAAG GCAGAAGGCA QCTATGGOGT CCCACTGTGa 3000 

^ ATCATCATCC TAGCCATCCT GrTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

JU TACAAGCrmS 6ATTCTTCAA AOGCTCCCTC CCATATGGCA COSCCATCaGA AAAAGCTC3US 2120 

CTCAAGGCTC CAGCCAOCTC TGATGCCTOA 3150 
Seq ID NO: C3 DNA Sequence

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1410



1 11 21 31 41 51

| | | | | |

AaxsCACAOcrr ttcctccacp scrGCTQcrq tTrorrcrGGG gtgtggtgtc acrcagcttc 6o 

^Kl CCAGCGACTC TAGAAACACA AGAGCSUiGAT GTGGACTTAfi TCCAGAAAXA CCIGGAAAAA 120 

TACTACAACC TGAAGRATGA rEGQSAaGCAA OTCGAAAASC GGAGAAATAl^ TOaGCCAGTG 180 

OTTGAAAft&T TGAAGCaAAT GCAQGAATTC TTTGOeCTISA AAOTOACTGG GAAACC3VGSRT 240 

GCTGAAACX:C TGAAGGTGAT GAAQCSU3CCC AGATC3TGGAG TGCCTGATGT OGCTCAGTTT 300 

. GTCCTCACTG AGGGOAACCC TCGCTGGGAfi CAAACACATC TQACCTACftG GATTGAAAAT 360 

TACAOSiTCAG ATTTGCCAAO AGCWGATOTG GAOCATGOCA TTGAQAAAGC CTXCCAACTC 420 

TGGft STAAT g TCAGACCTCT GACATTCACC AAGGTCTCTQ AGGGTCAAGC NSMSKCCKSQ 480 

AlATCTTTTG TCAG GSSBiaft. TCATCBGGAC AACTCTCCTT TTGATOBAO: TGG&GGAAAT 540 

CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTCSGAOOOT ATGCTCATTT TGATSAAQAT 600 

«^ GAAAOGTOGA CCAACAATTT CAOAGAGTAC AACTTACATC GTGTTGCOGC TCATGAACTC 660 

i>U GOOCATTCTC TTGQACTCTC OCATTCIACT GATATCGQGG CTTTGAT6TA COCTJieCTAC 720 

ACCTTCAQTQ GTGAT6ITCA GClAfSCTCna GKIGACATT6 ATGGCATCCA AGOCATATAT 780 

GGAOGTTCCC AAAATCCTGT CCAGCGCATC GGGOCAOlAA CCCCAAAAGC ATGTGACAGT 840 

AAGCTAACCT TTOATGCTAT AACTAjCQATT OQGGGAGAAG TGAT6ITCTT ZAAAQACAGA 900 

TTCTACATGC GCACAAATCC CTTCTACCCG GAAGTTGAGC TC3VATTTCAT TTCTGTTrrC 960 

I^CCftCAAC TGCCAAATGG GCITGAAGCT GCTTACQAAT TTGCCGACftG AGAI!rGAAGTC 1020 

GOGTTTTTCA AAGGSAAXAA €3TACTGG0CT GTTCaGQGa«! AGftATQaJOCr ACROGGaORC lOflO 

CCCftAGSA^ TCEACAGCTC CTXTGGCTTC CCXA<?AACTG TGAAGGATAt OGATGCTOCT 1140 

CTTTCTGAGG AAAACACTGO AAAAACCTAC TTCTTTGTTG CEAACAAATA CTG6AGGTAT 1200 

GATGAATATA AACQATCTAT GGATCaZftSGT TATCCCAAAA TGATAGCACA TGACTTTCCT 1260 

OU GGAATTOOOC ACRAAGTTGA TOCAGTTTTC ATGAAAGATG GATCXTTTCTA TTTCTTTCAT 1320 

GfSAACAAGAC AAHACSAATT TGATCCTAAA AGGAAGAGAA TTTTGACTCT OCnSAAASCF 1380 

AATAGCXGGT TCAACTGGAG GAAAAATTAG 1410



Seq ID NO: C4 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1410



1 11 21 31 41 51

| | | | | |

/V ATGCaCAGCT TTOCTOCACT GCTGCTGCTQ CTQTTCTGGG GTCTOSlGTC ACACAGCTTC 60 

CCAGOaACXC TAGAAACACA AGAGCAAGAI GTGGACTTAS TOaGAARTA CCTGGAAAAA 120 

TACKACAACC 7GAAGAAT6A TGOQAOQCAA GTTGAAAAGC GGAGAAATAG TGGCCCAGTG 180 

QTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAfiAT 240 

GCTGAAACGC TGAAGGTGAT SAAGCAQCCC AGATGTGGAG TGCCIGATGT GGCTCAGTTT 300

/3 GTCCTCACTG AGGGOAACCC TCQCrGGGAfi CAAACACATC TQAOCTACAG GATTGAAAAT 360 

TACAOGCCAG ATTTGCCAAfi AOCAQATOTG GACCATGCCA TTGAGAAAGC CXTCCAACTC 420 

TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AjQGOTCAAGC A6ACATCATG 480 

ATATCTTTTG TCAGGGGAfiA TGATOGOOAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540 

Cn^GCTCATa CTTTTCAACC AGGCCCAGGT ATTGGAGGQG ATGCTCATTT TGATGAAGAT 600 

OU C5AAAG6TGGA CCAACAATTT CAGAGAGTAC AACTTACMC GTGTTGOGGC TCAIGOCCTC 660 

GC5CCATTCTC TTOGACTCTC CCATTCTACT GAXATCQOGG CTTTGATOZA CCCXAGCTAC 720 

ACCTTCAGTG GTOATGTTCA GCTAGCTCA6 GATSACATTO AIGGCATCCA AGGCATATAT 780 

OGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATOTGACAGT 840 

AAQCTAAGCT TTGAIPGCrAT AACIACGATT GGGGGAGAAiS TGATGTTCTT TAAAGACAGA 900
TTCTACRTGC GCACAAATCC CTTCTAOCQB GAftGTTGAGC TCAATTTCAT TTCTGTTTTC 960

TGGCCACAAC TGCCAAATGG GCTTGAAGCT GCTTAOGRM TTaCCC3AC3\G AGftTGAAGTC 1020 

CQQTTTTTCA ARGGCSaATAA GTACTOGGCT GTTCAQGGAC W3AATG1GCT ACACOQftTAC 1080 

CCCA2VGGACA TCTACAGCtC CTTTGGCTTC CCXAJSAACTG TGAAGCATAT GGATGCTGCT 1140 

CITICTGAGG AAAACftCTGG AAAAACCrrAC TTCTTTQTTG CrAACAAATA CTBGACQTAT 1200 

0AT62\ATATA AftCGATCTAT GGATCX^AGQT TATCCCAAAA TGATAGCACA TGJ\CITrCCT 12 GO 

GGAATTQGCC ACAAAOTTGA TGCASTTTTC AXGAAAGATG GATTTTTCTA TTTCTTTCAT 1320 

06AACAAQAC AATACAAATT laATCSCTAAA AGQAAOAiSAA TZTTGnCTCI CCACAAAGCT 1380 

AATAGCTGGT TCAACTGCAG GAftAAATTAG 1410 

Seq ID NO: C5 DNA Sequence 

Nucleic Acid Accession #: NM_014331.3

Coding sequence: 1..1506



1 11 21 31 41 51

| | | | | |

ATGGTCAGAA AGCCTGTTGT GTCCACXaTC TCCAAAGGAG GXTAOCTGCA GGGAAATOTT 60 

AAC6GGACGC TQCCTTOCCT QGGCAACAAQ f^M^f^HCX^ GGCAGQAQAA AGTGCA5CT6 120 

AAGAGGAAAG TCACTTTACT GAGGGGAGZC TOGaiTTKICA TTGQChCCAI CATTGGAfiCA IBO 

ZV GGAATCTTCa TCTCTOCTAA GOGOSTGCTC CAGAACAOGG GCRGCGTOGG CRTGTCTCIG 240 

ACCATCTGGA C3GGTGTGTOS GGTCCTGTCA CTATTTGGAG CXTTGTCTTA TGCrSAATTG 300 

GGAACAACTA TAAAOAAATC TGGAGGTCAT TACACATATA TTT17GGAAGT CTTIGGTCCA 360 

TTACCAGCTT TTGTACGAGT ClTGGGTGGAA CTCCTC&XAA TAOGCCCCGC AGCXACIGCT 420 

6TGATATCCC TGGCATTTGO ACOCTACATT CTGOAACXAT TTTTTATTOL ATGTGAAATC 480 

CCTQAACTTO CX3ATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGAT G6TOCTAAAT 540 

AGCATGAGTG TCASClGGAiS CGCCC3GC3ATC C3U3ATTTTCT TAACCTTTTG CAAGCTCACA 600 

GGAATTCrOA TAATTATAGT CCCTGGAGTT ATCGCftGCTAA TTARAG6TCA AAOGCaSAAC fi€0 

TTTAAAGAOG CGTTTTCAQG iiAGAGATTCA AGTATTAOSC G6TT6CCafaC!T QGCTTTTTAT 720 

TATGOAATQT ATQCATATGC TGGCTGGTTT TACCTCAACT TIWA'ACIW AGAA6TAGAA 780 

AAOCCTGAAA AAACCATTCC OCTTGCRATA TGTAT&TCCA TGGCCAlTCGT CACCATTQGC 840 

TATGTQCTGA CAAATOTOaC CrCACTTTAGG ACCATTAATQ CTGAiSOAQCT GCTGCTTTCA 900 

AATGCftGTGG CA6TGAC3CTT TTCTGftGCGG CTACTGGGAA ATTTCTCATT ACCAGTTCCXS 9 SO 

ATCTTTQTTG CCCTCTIXTO CTTTOOCTCC ATQAacaGTO G7GTOTTTGC TGTCTOCAGG 1020 

TTAITCTATG TTGOGTCTCG AGAGGGTCftiC CTTCCAGAAA TCCXCtCCM: ^TTCATGTC 1080 

CGQVAGCACA CTCCTCTACC AQCXGTTATT QTTTT6CACC CTTTGACAAT GATAAT6CTC 1140 

TTCTCTGGAG AOCTOGACAG TCTTTTGAAT TTCCrCftGTT TIGOCAGSTG GCTTTTEATT 1200 

GGGCIOGCAG TTGCTGQGCT GATTTATCTT COATACAAAT GCCCAGATAT GCATOGTGCT 1260 

TTCAAGGTGC CACTGTTCAT OCCAGCITTG TTTTOCTTCA CATGCCTCIT CATGDTTGCC 1320 

ciTorcccTCr attcooaccx: atttasiaca ggqattggct tostcatcac TCTSACTGQA 13 bo 

GTCCCTGOGT ATTATCTCTT TATTATATGG GACAAGAAAC CCftGGIGGfIT tEAGAATAATG 1440 

TCtlGAGAAAA TAACCAOAAC ATTACAAATA ATACTQGAAG TTGTACCASA AGAAGATAAG 1500 

TTAT6AACTA ATGGACTTGA GATCTTGGCA ATCTGOOCAA GGGGAGA£S^ AAAATAQGOA 1560 

TTTTTACTTC AITTTCIQAA AGTCTAOAOA ATTACAACTT TGGTGI^XAAA CAAAJUSGAGT 1620 

CAGTTATTTT TATTCATATA TTTTAGCATA TTOGRACTAA TTTCTAASAA ATTTAOTTAT 1680 

AACTCTATOT AGTTATAGAA AGTGAATATG CAGTTATTCT ATGAGTCGCA CAATTCTT6A 1740 

GTCrCIGAlA CCTACCTATT GOOGrTAQGA GAAAAGACTA GACAATTACT ATGTGGTCAT XQOO 

TCTCIACAAC ATATGTTAGC A03GCAAAGA AOCTTCAAAT TGAAGACIGA GATTTTTCT6 1860 

TATATATQGG TTTTGIAJiAG ATGGTTITAC ACACTAC?Sfi& TGTCXATACT OTGAAAAGIO 1920 

TTTTCAATTC TGAAAAAAAG CATACATCAT GATTATGGCA AAGAGGAGAG AAAGAAATTT 1980 

ATTTTACATT GAGATTGCAT ^tGCTTCCXSCT TAGATAOCAA rtTTAflATAAC AAfljtaCTCRT 2040 

SCTiHi'AATGO ATTATACCCA QAGCACTTTG AACAAAGGTC AOTGGGGATT GTTGAATACA 2100 

TTAAAGAAGA 6TTTCTZ«3QG GCTACTGTTr ATGAGACACA TGCAGGAGTT ATGTXTAAGT 2160 

AAAAATOCIX GAGAATTTAT TATGTCAGAT GTTTTTTCAT T(ATTATCAO GAAGTTTTA6 2220 

TTATCTOTCA TTTTTTTTTT TCACATCAGT TTGATCAQGA AZUSTGTATAA CACAXGITXAG 2280 

AGC MiGft BTT AjSTTIGGTAT TAAATQCTCA TTAGAACAAC CACCTGTTTC ACTAATAACT 2340 

TAaSOClGAT GAGrrCTATCT AAACATATGC ATTTTAAGCC TTCAAATTAC ATTATCAACA 2400 

TGAGMSAAAT AACCAACAAA GAftOATOTTC AAAATAAIAG TCCCATATCT GTAATCATAT 2460 

CTA CATG CAA TOTTAOTAAT TCTGAAGTTT TTIAAATTTA TGGCT A TTTT TACAjCGAXGA 2520 

TCRATTTTGA CAGTTTGTGC ATTTTCTITA lACArCTTTAT ATTCTTCrGT TAAAATATCT 2580 

CTTCauSATOA AACTQTCCM ATXAATTAGO AAAAGGCATA TATTAACATA AAAATTGOU^ 2640 

AAQAAATGTC GCTGTAAATA AGSOTTACAA CTGATGITTC TAGAAAAXTT CX!ACITCSAT 2700 

ATCCAG6CIT TGTCZUaTAAT TTGCACftCCT TAATTATCAT TCAACTTGCA AAAGA0ACAA 2760 

CIGATAAGAA GAAAATTGAA ATGRGAATCT GTOGATAAGT GTTTGTGTTC AGAAGAJGTT 2820 

^_ OTTTTGCCAG TATTAGftAAA TACTeiGAGC C3GC5QCATGGT QGCTTACMC TQTAATCCCA 2680 

OJ GCZUmrrTGOQ AOGCrGABGa GQTGGATCAC CT6AGGTCGG GAGTTCIAGA CCZ^GCCTGAC 2940 

CAACKTGGAG AftACOCXIATC TCXACTAAAA ATACAA;^T:r AGCTGGGCAT GOTGGCACnT 30 00 

GCIGeSAATC TC3U3CTATTQ AGQAGGCTGA GGCAQGA6AA TTGCITOAT^ OOGGGAGGCG 3060 

iaAOGTTGCAG TGAGCCAAGA TTGCACCACT GTACXCCAGC CTGGGSOACA AAGTCAOAGT 3120 

WaCltXM, AAAAAAAAAA AAAA 3144 



Seq ID NO: C6 DNA Sequence

Nucleic Acid Accession #: NM_003246.1

Coding sequence: 112..3624

1 11 21 31 41 51

| | | | | |

GGACGCACAG GCATTC3CCCE CGCCCCXCCA GOCCTOQCCG CXTCTCGCCAC CGCTCOCGGC 60 
CQCCX30GCTC C3GGTACACAC AGGATCXXTG CTGGGCACCA ACAGCTCCAC CATGGQGCKS 120 
GCCTGGGGAC TAGGCGrOCI GTTCCTGATG CATOTGTQTa GCACCAAOCG CATTGCAGAG 180 
TCTGGC3GGnG ACAAGAGGST GTTTGACAXC TTTGAACTCA CCGGGOCOGC COGCAAGGGG 240 
TCVGGGOSGC GACIGGTGAA GGGOCCOGAC CCITCCAGGC OUaCXTTCOa CATGGAGGAT 300 
GCCAAOCIGA TCCCCX:CTG7 GCCTGATGAC AAGTTOCAAG AOCTGGTGGA TGCTGTGOGG 360 
GCAGAAAAGG 6TTTCCTCCT TCTQGOITCC CIGAfiGCASA IGAAGftAGAC CX^GGGGGACSS 420 

CTGCTGGCCC TGGAGOGGAA JU3ACCACTCT GGCCAI3GTCT TCAGCGTGGT C3TCCAATGGC 480

AAI5CK3GGGCA CCCTGGACCT CAGCCTOACC GTCCAAQOAA iW3CAGCA0GT GGTGTCTGTG 540 

GAACAAGCTC TCCTGaCAAC CXSGCO^GTGG AAGAGCATCA CCCXGITTGT GCAQQAAGAC 600 

AGQOCCCAGC TeTACATOSA CrOTQAAAAG ATGGAfSAATB CTGAGTTGGA CGTCOCCRTC 660 

D CAAAGCGTCr TCACCAQSfSA OCTGGCCAGC ATCGCCAGAC TCC3GCATCQC AAAG<3GGGGC 720 

ATTTCCAGGG GQTOCTaCAG AATGTGJM3QT TTGTCTTTQ6 AACCACaCCaV 780 

GhAGACATCC TCAGGAACAA AGGCTGCTCC AGCTCTACCA GTGTCCXCCT CACCCTTGAC 840 

AACAAOgTQQ TGAATGGTTC CAGCCCTt3C3C ATCCGCACTA ACTACATTG6 CC!ACAftGACA SCO 

AAGQRCTTGC A/USCCATCTG CGGCATCTCC TG7GATV3AGC TOTCCAGCAT GGTCCTG6AA 5S0 

lU CTCaGGGGCC TGOGOiCcaT TGTQACCACG CTGCAGGACA GCSVTOCGCAA AGTGACTGAA 1020 

OAGAACAAAS AGTTGGCCAA T6AGCTG2VGG CGGCCTCCCC TATGCTATCA CAACaOAGTT 1080 

CnfiTACAGAA ATAAC3GAQQA ATQGACTC3TT GATAGCTGCA CrGftBTQTCA CTGTCAGAAC 1140 

TCA6TTACCA TCTGCMAAA GGTGTCCCGC CCCATCATGC OCIGCTCCAA lOCCS^CftOTT 1200 

CCTGATGGAC3 AATGCTSTCC TOSCTGTTGG CCCAGCGACT ClGGOQAOSV TGGCIGGTCT 1260 

ID CCATGGTCOG AGTOOACCTC CTGTTCTACG AGCTGTOGCA ATGGAATTCA GCAfiCGOGGC 1320 

03CTCCTGTO ATAGCCTCAA CAACJCBATOT GAGGGCTCCT CGGTCCAQAC ACGGROCTGC 1330 

CACATTCAJ3G AfiTGTOACAA AAGATTTAAA CRtSGATGQTa CCXGGAGOCA CTGGTCCCCJa 1440 

!rG6TCATCTT GTTCTGTG&C Arr6TG0TC3AT GGTGIGRICA. C9UV8(3A3GCS QCTCTOCAAC X5O0 

TCTCCCA6CC OCCIU3ATQAA TGGGKAACOC TGTGAAQGCQ AAC3C60QGGA GACCAAASCC 1560 

ZU TGCRAGAAAG ACGCCTGCOC CATCAATGGA GGCTGGOGTC CTTOSTCACC ATGGGACATC X620 

TGTTCTGTCA CCTOTGGAGG AGQSGTACAG AAACOTAGTC GTCTCTGCAA CAACCCXXSCA 1680 

CXrcCAQTTTG GAGGCRAGGA CTGCXSTTOOT GATGTAACSiiG AAAACCAKSAT CTCCftACftAG 1740 

CaOGACTGTC CAATTGATQCS ATGCCTGTCC AATCCCTGCT VTOCXSGGCG/t GAAGT-GTACT 1800 

^ _ AGCTACCCTG ATGGCftGCTS CSUlATSTBOT GCTTGTCOCC CTG GTJ ACAG TGGAAATGGC IB 60 

Z-> ATCCAGTGCA CAGATOTTGA TGAGTGCAAA GAAGTGCX^TG ATGCCTGCTT CAACCACAAT Z920 

QGAQAQCACC GOTGTGAGAA CAOGGAOCCC GQCTACAACT GCCTGCCCTG CCX30GCA0GC 1980 

TTCACX3GGCT CAdKSCGCTT CQOCCAGGGT GTOGAACAtCG CCACGGCCAA CAAACfiGOlXS 2040 

TGCftAGCCCC GTAACCOCTG GAGGGRTGG6 ACCOkCBlkCr GCAACAAGAA COCXMGTQC 2100 

AACTACCTGG GCCACTATttS CQftCCCCMa TACOSCTGOG ASTGCSkAaDC TOGCTAOGCT 2160 

jV GGCAATGGCA TC3V.TCTGCC3G GGAGGACACA GAGCTOGATS BCTGGOCCAA TGAGAACCTQ 2220 

GTOraCXTTGO CCAATGOGAC TTACCACTQC AAAAROGATA ATTGCCCC3VA CCTTOCCAAC 22 BO 

TCAGQGCftGG AflfiACTATQA CAAGGATQGA ATTGGTGATG C3CTSTGAIGA TGAOGATQAC 2340 

AATOATAAAA TTCGAGATGA CAGGOACaU^C TGTOCMTTCC AllXACAACCC AGCTCAGTAT 2400 

GACTATSACA QAIlATGATGr GQGAGACC3eC TOTSftCAACT GTCCCTftC&A CXaW3UUXX3V 2460 

J J GATCftGGCAS ACACAGAG^ CAATGOGOAA GGAI3AGGCCT GUSCTOCAiaA CATTGATGOA 3520 

GAOCSGtATCJC TCftATOAAOa GGACAACTQC CAGTAOSTCT ACAATGXGGA CC3«3AGftGAC 2580 

ACIQATATOG ATGGGGTTGG AGATCRQTQT GACAATTGCC OCtTOGAACA CAATCCGGAT 2640 

CAGCTGGACT CTGACTCAGA COSGATIGGA GATACCIOTa AGAACAATCA GGATATIQAT 2700 

GAAGATQOCX: AC3C!AGAACftA TCTOOACAAC TGTCOCIWrG TGCCCRATGC CaACCaGQCT 2760 

**\} GACCATGACA AAX^ATGOCAA GQGAGATGCC TGTGAOCSiCO ATGATG!A£AA OGATOOCATT 2020 

CCrGATOACA AGGACAACIG GRjaRCTCOTG COCMiTCCCG ACCAtaAAGOA CTCTGAGGGC 2flS0 

GATGGTC6AG GTGATGCCTG CAAASATGAT TTTOAI^CATQ ACAGK5T60C AGACATCQAT 2940 

GACATCPgrC CTGASAA:n?r TGAC3VTCA8r GAGACOQATT TCQQCCXaATT GCAGATGAST 3000 

. _ CCTCTGGACC CCAAAOGGAC ATOCCAAAAT GACCXIEAACr GOGTTGIACS CCATCAGGGT 30 60 

AAAOAACTOG TCCAGACTGT CAACTGTGAT CCTGGACTCQ CTGTAGGTTA TGAlrGAGO^ 3120 

AATGCTGTGQ ACTTCAGTGG CACCTTCnTC ATCAACAOCG AAAGGGAOQA TGACTATGCT 3160 

QQATTTGTCT TTGGCIACC3i. OTOCAGCAGC OGCITTTATG TTGTGATGTG GAAGCAAGTC 3240 

ACOCAGXCCT ACTGGGACAC CAAOCCCACG AGGQCTCAGG GATACIGX3GG CCTTTCTGTG 3300 

AAAQTTGTAA ACTOCACC&C A6GGCXTOGC GM3CM1CSGC GQAAGgOCCT GTG0C3U!ACA 3360 

3U GGAAACAGOC CTQOCXSiGaT GCGCAOCCTG trGGCATGACC CTOGTCACaX AOGCTOQAAA 3420 

Gfl3CTT<2ACXSa CCTACAGATG GOGTCTtMC C3WIAGGCCAA AGRCOGOTTT CATTAGAGTG 3480 

GTQATGTAT6 AAGGQAAOAA AATCATGGCT GACTCAaOAC OCATCTATGA TAAAACCTAT 3540 

GCTGGTGGTA QACTAGGGTT GXTTGTCTTC TCTCAAGRAA TGGTGTTCTT CTCXGAOCTG 3600 

AAATACGAAT GTAGAGATC!C CTAATCATCA AATTGTTOAT TGAAAGACTG ATCKEAAACC 3660 

DD AATQCTGOTA TT GCACC TTC TGGAACTATG GOCTTOAGAA AACOOCCftOG ATCaUTTTCTC 3720 

CTTOGCTTOC TTCTTTTCIG TQCTTGCATC AGTGTGGACT OCTAGAAOGT GOGAGCTGCX! 3780 

TCAAQAAAAT OCAGTTTTCA AAAACAfiACT CATCAGCRTT CAGCCTCCAA TGAATAAGAC 3840 

ATCTTOCAAG CATA^TAAACA AITaCTTTGS TTTCCTITTG AAAAAGCATC TACITGCTTC 3900 

AGTTGOSAfte GTOOCCATTC CACICXGOCT TTBTCACSU3A GCAGOSTQCT ATIQTOAGGC 3960 

CATCTCT 3967

Seq ID NO: C7 DNA Sequence

Nucleic Acid Accession #: NM_002192

Coding sequence: 86..1366

1 11 21 31 41 51

| | | | | |

rcCACACACA CARAAAACX:T GCGC6*rGAGG OaOGAGGAAA AGCAGGGCXHT TTAAAAAGGC 60 

AATCACAACA ACTTTTGCTQ CCAGQATGOC CXTGCxarrOG CTGAGAGGAT TTGIGXTGGC 120 

'V AAGTTG CTQG ATTATAGTGA QCSAOTTCXXSC CACCCXMGA TCOGAQOOGC ACAGOGOGGC 180 

CCOOmCIGT COGTCCTGTG GGCTSGCXZGC CXTTCCXIAAAG GATGTAOOCA ACICTCAGCC 240 

AGAGATGQTG GAGGCOGTCA AOAAGCACAT TTTAAACAXG CTGCACTTOA AGAAGAGACC 300 

OGATGTCACC CAOCCGOTAC OCAAGGOGGC GCTTCTGAAC QCSQAXCAGAA AGCTT^TGT 360 

— ^ 0OGCAAAGTC GGGGAGAAG6 GGTATQTGOA 6ATAGAGGAT GAOWTTOGAA GGAGGC3CAGA 420 

/J AATGAATGAA CTTATQGAeC AGAOCTCGGA GATCATCACH TTTGCOGAGT CAGGARCAOC 4 BO 

CAGGAAGACS CTGCACTTCG AQATTTCCAA GGAAGGCAGT GACXZTGTCAG TGGTGGAGOG 540 

TGCAGAAGTC TGGCTCTTCC TAAAAGXOCC CAAGQCCAAC AGGAOCAGGA CCAAAGTCAC 600 

CATCOGCCTC TTCCAGCAGC AGAAGCACCC GCAGGGCA^^ TTGGACSUZAO GGGAAGAGSC' 660 

GGAGGAAGTG GGCTTAAAGG GGGAGAGGAG TGAACTGTTG C3CTCTGAAA AAGIAQTAGA 720 

oU CSSCrCGGAAG AGCACCTG6C ATGTCTTCCC TOTCTCCAGC AGCATOCaflC GGTTQCTOGA 7 BO 

OCAaOQCAAg AB CTOOC TGG ACGTTOGGAT TGCCTGTGiAG CAGTGCC3W3G AGTiGTGGOGC 840 

CaOCXTGQTI CTGCTGOaCA AGAAQAAGAA GAAAGAACSAG GAGGGGGAAG GGAAAAAGAA 900 

GGSCQGAGGT GAAGSTG0G6 CnC3GAGCAGA TGAGGAAAAG GAGCAGTOGC ACAGACCTTT 960 

CSCTCATGCTG CAGGCG08&C AGTClGAAGA OC3^iCCX:TCAT CBCCGGOSTC GGCQGGGCTT 1020 

GGAGTGTQAT GGCAflOGTCA ACATCTGCTG TAAGAAACftG TTCITTQTCR GTTTCAAGQA 1080

CnrCGGCTGG AATGACTGGA TCATTC3CTCC CTCIGQCTAT CftTGCC&ACE ACTGCGASGG 1140 

TGAGTTSCCCa AGCCATATAG CAGGCAOGTC CSGGOTCCTCA CTGTCCTTCC ACTCAACA6T 1200 

CATCRACCAC lAOCGCATGC OGOOCCATAG CCCCTTTQOC AACCTCAAAT OGTGCTGTGT 12 60 

GCCCACCAAG CTGAfiACCXZA TGTCCATtSTT GTACTATGAX GATGQTCAAA ACATCATCAA 1320 

AAAGGACATT CAORACATQA TCGTGGAGGA GTGTC3SQTGC TCATAGASn QCCCAGCCCA 13a 0 

GGGGGAAAQG GAGCAAGAGT TGTCCAGAOA AGACAGTGGC AAAATGAAGA AATTTTTAAG 1440 

OITTCTGAGT TAACCfiOAAA AATAGAAATT AAAAACAAAA CAAAACAAftA AAAAAAACAA 1500 

iWiAAAACAA AAGTAAATTA AAAACAAACC TGATGAAACA GATGAAACAG ATGAAGGAAG 1560 

ATGTGGAAAT CTTAGCCTGC CTTAGCCftfiG GCTCAGAGAT GAAQCftGTGA AGAGACAGAT 1€Z0 

TGGGfiOGGAA AGGOAGAATG GTGTACOCTT TftTTrCTTCr GAAATCACAC TGATCACATC IfiSO 

AGTTGlrTTAA ACGGGGTATT GTCCT^TCCC CXXTTGfijGGT TCCCTTOIGA GCTTQAATCA 1740 

ACCAATCTGA TCrEGCAGTAG TBTOGACTAG AAO^ACCCM ATAGCATCTA GAAAGCCATO 1800 

AGTTTGAAAG GC3CCCATCAC AGGGACTTTC CTAGCCTAAT 1840 




Seq ID NO: C8 DNA Sequence

Nucleic Acid Accession #: NM_000095.1

Coding sequence: 26..2299



1 11 21 31 41 51

| | | | | |

CAGCACCCRG CICCXXXSCCA CCGCCATGGT CCCOGAIMC GCCTGCGirrc TTCTGCtCAC 60 

OCrGGCTGCC CrOGGCGCOT CDGGACRGGG CC3MQAGCCOG TTGQQCTCAG ACCrOOOCCC 12 0 

GCftOATGCTT CC3GC3AACTCC AGGAAACCAA GGOGGOGCTO CW3GAOGTGC QCGACXGGCT 180 

GCGGCaGCAG GTCAGGGAGA TCAOGTICCT GAAAAACA06 GTGATOOAOT GTGACQOCSTO 240 

CXSGGATGCAG CftGrC3«3TAC GGACCJOGOCr ACCCAGOGie CS3GCC0CTCC TC3CACTGG6C 300 

GOCOGGCTTC TOCTTCCCGG 0CX3TGGCCTG CATCCaGACG GJ«3AGOC5QCJG GCOGCXGCGQ 360 

COCXITGCCa: GCGGGCTTCA OGGGCAACGQ CTOGCACTGC ACCGACWTCA ACXSAOTGCAA 420 

OU C3GCCCACCCC TQCTTCCCCC GAGTCOGCTG TATCAACACC AGCCCGGGGT TC50GCTGGGA 4B0 

GGCTTGCC03 COGGGGTACA GOGQCXXCAC CCAOCRGGGC GTOGOGCTGG CTTTCOCCAA 540 

GGCCRACAAG CflGOTTTGCA GGGACATCAA OSA6T8TGAG ACOGGGCAAC ATAACTGOGT 600 

COCCARCTCC GTGTGCATCA ACAOCCGGGO CTCCTTCCAG TOC3GGCX:CGT GOCAGCCCGG 660 

CTTOGTQGGC GACCAGGOGT COt3GCTGCCA GOGC3GGOC3CA CAQOSCTTCr GOCCCGACGQ 720 

JD CTCGCCCAQC GAGTGCCAfiG AGCATGC3ifiA CTGCGTCCTA GAOCaaCGATC GCTCOOOGTC 7B0 

GTGCGTC3TGT OQa3TTQGCT OQGCaOGCAA OSQG&TCC5CTC IGTGGTOGCQ ACACTGACCT 840 

AGACGGCTTC CCGGAOGAGA AGCTGaSCM CCOSGROOOG CaflTOCOGTA AGOACftACTG 900 

C30TC3ACTGTG CCCAACTCftG GGCAGGSJSGA TGTGGivaCGC GATCGCATCG GAGACGCCTG 960 

CGATCOSGAT GCOGACXSOGG AOGGGGTCCC CftATGAAAAG GACAACTGCC CGCTGQTGCG 1020 

HXJ QAACCCAGAC CAGCGCAACA OQGAOaAGGA GAftGOXSHGGC GATGCGTGCX3 ACAACIGC3CQ 1080 

GTCCCMAAG AACGACGACC AAAAGGACAC AGAOaGGAC taacOBGOGOG ATGCQTOOG& 1140 

CQACGACATC GRCC5GCGACC OGATOaSC3A CCAGQOCSBAC AACTGCCCa'A GGGTACCCRA 1200 

CTC2tf5ACCAG AAQGACAOTG ATGGCGATGG TATAGGG6AT GCCTOTGACA ACTOTCCCXS^ 1260 

. QAAOAGCAAC CCGGATCAGG CJQQATGTGGA CCACOACTTT GTGGGAGATQ CTTGTGACAQ 1320 

^tD Q6ATC3UIGRC CAGGATGGAG AOGQAC&TCA GGRCTCTCGQ GACRACTGTC CCACGGTGOC 1380 

TAACnGTGtX: CAGGAGGACT CAGROCAOSA TOGCXaGGGT GATGCCXOCQ ACGACGAGOA 1440 

OGACRATGAC GGAGTCCCTG ACAfiTCSBGOA CRACieOaQC CTOSTGCCEA ACGCOQGOCA 1500 

QQAGGAOSOG GACAGGGAOS flCSOTGGQOGA CGtGTGCCAG GAOaACrTTO ATGCSGACAA 1S60 

GGTGGXAOAC AAGATC3QA0G TGTGTCOGGA QAAOGCTQAA GTCAOSCTCR COaACTTCAG 1520 

GGGTQflCQCG CAGATlOAOC OCAACIGGGT 1680 




GGTGCTCAAC CRGGGAAGGG AGATaSOSiai GACAKTGAAC AflOQBWXCRG GCCTOaCTOT 1740 

QQGTTACACr QCCTTCAAT6 GCQTOaaCTT OSAGGGCAOG TTOCAMMA AC3M3QGTCAC 1800 

GGATGACGRC TATGOOGQCT TCATCTTTGG CTACXaGGAC AfiCTOC3W3CT TCTACJGTGGT i860 

CAOOnOGAftG CRGATGGAGC AAACQTATTG GCAGGOGAAC COCTTCCKTa CTGTQGCOSA 1920 

QCCTOGCKTC CAACTCAAGO CTGTGAAGTC ITXCCACAGGC CCCXSGOahAC AGCTGOGGAA 1980 

OGCTCTOTGG CATACS\GanG ACACASRGTC CCAGGTGOaQ CTGCTGTOOA AGaACC0GC3G 2040 

AAAOGTGQGT TGGAAGGACA AGAA6TCCTA TCeTTGGTTC CTGCftOCaOC GGCCCCAAGT 2100 

GGGC TACATC AGGGT&OGAT TCTATGaOGG CCCTGAGCTO GTGGCCGACA GCAAOGTGGT 2160 

CTTOCMACA ACCATGOQGG GTGGCOGCCT GGGQOTCTTC TGCTTCTCCC AGGAGAACAT 2220 

CATCTOGSOC AACCTQCCTT ACXJGCXGCAA TGACAOCATC CCRGRGGACT ATGAOAOCCA 2280 

TCAQCTGCGG CAAGOCTAGG GACCAOGGTG KSGRCfXSBCC GGATCSAiCAGC CACCCTCAOC 2340 

OOGGCTGQAT GGGGGCTCTO CAOC5CAGOCC AflGGGGTGGC CSTOCTOftGG GGGAAOTOAG 2400 

AAGGGCTCAS AGAQOACAAA ATAAAGTQTG T6TGCAGGQ 2439 



Seq ID NO: C9 DNA Sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

| | | | | |

GGGAOGGAGA QAGGCGOSOG GQTQAAAGGC GCATTOATGC AGOCT6C3QGC GGCCtCGOAG 60 

OGOOGOeORG OCaGAOSCTG AOCAOGTTCC TCTOCTC66T CTCCTCCGCC TCCAGCTOCG 120 

CGCPGOOCGG CAGCX3GGGAG CCATGOGACC CCAGGGOCXX: G0CX3CCTCCC OSCAGCGGCT 180 

CCGGSGCCrC CTGCTOCTOC TGCTGCTQCa GCTGCOOGOG CCaTOSAGOS CX3CTGAGAT 240 

CCCCAAGGGG AAOCAAAAOG OQCAGCTCOG GCAGAGOGAG GTGGTQGAC3C TGIATAATOa 300 

AAISTGCTTA CAAGGGCa:»f3 CRGGAQTOCC TOGT06A6AC GGQAGCCCTG GGGCCAATGG 360 

CATTCG6GGT ACAQCTGGGA TC3C!CAGGT0G GGAIOQATTC AAAGGAGAAA AGGGQGAATO 420 

OV TCTGAGGGRA AGCTTTGAflG AGTCCTCGAC ACCJCAACTAC AAGCAOmSTT CATX3GAGTTC 480 

ATTOAATTAT GGCATA GATC TTGGGAAAAT TGCGGAGTGT ACATTTACRA AGAO^CGTTC 54Q 

AAATAfiXQCT CXAftGAGTTT TGTTCftGXGa CTCACTTOSG CTftAAATGCA GAAATOCATG 600 

CTGTCAGCGT TGGTArrrCA CATTCAATGG AGCTGAATGT TCAQGACJCTC TTCCCATTGA G60 

AGCTATAATT TATTTGOACC AAQOAAGCCC TOAAATGAAT TCSkACAATTA ATATTCATOS 720 

CACTTCITCr OTGGAAGOAC TTTGfXGMGa AA.TTG<>rGCT G6ATTAGXGG A.TGTTt3CrAT 780

CTOGGTTGGC ACTTSTTCAG ATTACCCRAA A66AQATGCT TCTACTCSQAT tSGAATTCRGT 840 

TTCTCQCftTC ATTATTGAA6 AACTACCAAA ATAAATGCIT rrAATTTTCAT TTGCTACCTC 900 

TTTTTTTATT ATGCCTTGGA ATGOTTCRCT TAAATGACAT TTTAAATAAn TTTATCrTATA 9^0 

!> CarCTOAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AASTGTGAnr TCACACTGI^ 1020 

TTTAAATCT& GCATTATTCA TTTTGCTTCA ATCAAAAGTC? ITTTTCAATAT TTTTTTTAGT 10 BO 

TQQTTAOAAT ACTTTCTTCA TACsTCACATT CTCTCAACCT ATAATTIOeA ATATTGTTGT 1140 

GGTCTTTTTGr TTTTTCTCTT AJ3TATAGCAT TTTTAAAAAA AXATAAAAOC TAOCftATCTT 12 OO 

TGTACAATTT GTAAATQTTA AOAATTTTTT TTATATCTOT TAAATAAAAA TTATTTCCAA 1260 

CAACCTTAAA AAAAAAAAAA AAAA 1284

Seq ID NO: C10 DNA Sequence
Nucleic Acid Accession #: ini_00322?
Coding sequence: 41..295

1 11 21 31 41 51 

I t I 1 I I 

ATdJCTOACT OGGGGTOGCC TTTGGA(3CRG 7USAGGAGGCA ATGQCCMCA TGGAiGAACAA 60 

GGT6ATCTGC GCCCTGGTCC TGQTQTCCAT GCTOTCCCTC GGCACCCTG6 CCGAGGCCCA 120 

QACAGAGACG T6TACAGTGG COCCCOGTGA AAGACAGAAT TQTQGTTTTC CXGGTGTCAC IBO 

GCGCTCCXAG TGTGCAAATA AO0GCTGCTG TTTOGAOGAC ACOGTTOGTG OG6TCCCCTQ 240 

GTOCTTCTAT CCTAATACCA TOGAOGTCXX: TCCAGAAGAB GAGTQTQAAI* TTTAGACIACr 300 

TCTGCAGG6A TCTGCCTGCSV TCCTGACGQO OTGOCOTCCX: CAGCAOGGTG ATTAGTCCCA 360 

GAOCTajGCT GCCACCTCCA CCGGACACCT CAGACACGCT TCTGCRGCTG TQCXTTCGGCT 420 

CACAACACAG ATTGACTGCI CTGACTTTQA CTACTCAAAA TTGGCCTAAA AATTAAAAGA 480 

GRTCOATATT AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 540 

Seq ID NO: C11 DNA Sequence
Nucleic Acid Accession #: NM_015419.1
Coding sequence: 1..8487






1 11 21 31 41 51 

I I I I I i 

ATGCCCAAGC GOQCGCACia QQQCSGCSCCTC TCCQTGGTGC TQATCXTTGCT TTGGGGCCAT 60 

CCGCGAGTGG CGCTGGCCTG CCGGCATCCT TGTGCCTGCT ACGTCCXXM OGAGGTCCAC 120 

T GCnO BBJCC GAtCCCTGGC TTCC2XIGCCC GCTGGCATTG CTAGACACGT GGS^AAGAATC 180 

AATTTGGQGT TTAATAlGCAT ACAGGGOCT6 TGAGAAAGCT CATTTGCAGQ ACTQACCAAG 240 

TTGGAGCTAC TTATOATTCA OBGCAATGAQ ATOCCAAGCA TCOCGGATGG AGCTTTAACJA 300 

GACCrCAGCT CTCTTCAGGT TTTCAAGTTC AGCTACAACA AOCXQAGAQT QATCACAGGA 3€0 

CAGAOCCrCC AGSQTCTCTC TAACTTAATG AGGCTGCACA rTGACCACAA CAAfiATOaRQ 420 

TTOMCCAOC CTCAAGCTTT GAACGGCIXA ACXSXCTCIGA GGCTACTCCA TTTGGAAGGA 400 

AKTCTCCrOC ADCAGCTQCA CCCCAOCACC TTCTCCAOGT TCACATTTTT GGATTATTTC 540 

AQACTCTCCA CCATAAGGCA CCTCTACTTA GCAGAGAACA TQQTTAOAAC TCTTCCTGCC 600 

AGCATGCTTC GGAACATQCC GCTTCTGGAO AATCTTTACT TGCAGGGAAA TCCGTGQACC 660 

TGCGATTGTG AGATGAlSATG GTTTTTBGAA TGGGA3X3CAA AATCCAQAOG AATTCIGAA6 720 

TGTAAAAAGQ ACAAAGCTTA TGAAGOGGGT CAGTTGTGTG CAATSXGCTT CAGTCX»AAG 780 

AAGTTGTACA AACATGA6AT ACACAAGCTG AAOGACATGA Ci ' l ' tfiVAW A GCCTTCAATA 840 

GASTCCCCIC TGAGACnCSAA CAGGAGCAGG AGTATTGAGG AGGAGCAAGA ACAGGAAflAS 900 

GATGGTGGCA OCCAGCTCAT CCTGGAfiAAA TTOQiACTGC CXZCAGTQGAG CATCTCTTT6 960 

AAT ATGAC X3G ACGAGCAGGQ GAACATGGTG AACTTGGTCT GTGACATGAA GAAACCAAT8 1020 

GATGTGTACA AGATTCACTT GAACCAAACG GAXOCTCXAG ATATTOACAX AAATGCAACA 1030 

GrTGOCTTGG ACTTTGAOTG TCCAATGACC Ct^USAAAACT ATGAAAAGCT ATGGAAAXTG 1140 

ATAGCATACT ACAGTGAAGT TCCGGTGAAG CTACACAGAG AGCTCATGCT C»GCAAAGAC 1200 

_« CCCAGAGICA GCIACCAGTA O^GGCAGGAT QCTGATGAGG AAGCTCTTTTA CTACAiCAGGT 1260 

GTGAGAGCCC AiQATTCTTGC AGAACCAGAA TGGGTCATGC AGCCA!rCC!AT JU3ATATCCAG 1320 

CTGAACOBAC GTCAGAdSTAC GGGCAAGAAG GTSCCACTTT CCTACXACAC CCAGTAITCT 1300 

CAAACAAIAT OCACCftAAGA TACAASGGAG GCTCBGOSCA GAAGCTGGOT AATQATTGnB 1440 

CCTAGTGGAG CTGTGCAAAG AGATCAGACT GTCCTOOAAG GGGGTCCATG CCA6TTGAGG 1500 

TGCAAGGTGA AAGCTTCTGA GAGTOCATCT ATCTTCTGGG TGCTTCCAGA TGOCTCCATC 1560 

CIGAAAGCGC CCAT6GATGA CCCAGACAGC AAGTICXCCA TTCTCSIGCAG TGGCTOGCTG lff20 

AGGATCAAGT CCA1:GGAGCC ATCTSA C ' i ' CA GGCTTGTACC AGTGCMnGC TCAAGTGAGG ICaO 

GATQAAATQG ACOGCATGGT ATATAGQGTA CTTGUGCAST CICCSCTCCAC TCAgCCMSOC 1740 

GAGAAAGACA CAGTSACAAT IGGCAAGAAC CXAGOGGAGT aSGTGACATT GCCTTGCAAT 1600 

GCTTTAGCAA TACCCXSAAGC CCACCTTAGC TGGATTCTTC CAAACAGAAG QATAATTAAT 1860 

DD GATTTGGCTA ACACATCACA TGTATACATG TTGCCSU^AXG OAACTCTTTC CATOCCAAAQ 1920 

GTCCAAGTCA GTGATAGTGG TTACTACAGA TGHGEGSCTG TCAAGCAGCA AGG&GC3U3AC 1980 

CATTTTACGG TGGGAATCAC TbGTGAlCCAAG AAAOGOTCTQ QCTTGOCATC CAAAAGASGC 2040 

AGAOSCOCAG QTGCRAAOac TCTTTCCAGA GTCAGAGAAG ACATGGTGGA GGATGAAGOG 2100 

GOCTOOGGCA T0G6AGATGA AGAGAACACI TCAAGOAOAC TTCTGCa^TCC AAAGGAGCAA 2160 

/U GAGaT{7TTCC TCAAAACAAA GGATGATGCC ATCAAtGGAG ACAAGAAAGC CAAGAAAGGG 2220 

AGAAGAAAGC TGAAACTCTO GAAGCATTOQ GAAAAAGAAC CAGAGAOCAA TGafl^GCAGAA 22fiO 

OGTCGCAGAG TGTTTOAATC TAGAOGAAG6 ATAAACATGO CAAACAAACA GAXTAATCGG 2340 

GAGOOCTOGG CTGATATTTT AGCCSAAGTC OGTGGGAAAA ATCTOGCTAA OGGCACAGAA 2400 

GTACCtXCQT TGATTAAAAC CACAAGTCCT CCATCCTTGA GCCIAGAAGT CACAOCACCT 2460 

TTTOCTGCTG YTTCTOCGCC CTCAGCATCT CCTSTGCAGA CAGTAAOCAG TGCXGAAGAA 2520 

TCXrrCAGCAQ ATGTACCTCT ACTXG6XGAA GAAl^CACG TTTTSSGTAC CATTTCCTOI. 2580 

OOCAGCATGG GGCTAGAACA CAAGCACAAT GGAGTTATTC TTGT1X3AACC TGAAGlSAlCA 2640 

AfiCnCAOCTC TGGAQGAROT TGTTGATGAC CXTTCTGflSA ASACTGAGGA GATAACTTCC 2700 

ACTGAAGGAG ACCTGAAGGG GACAGCAGCC CCTACACTTA TATCTGROCC TTAUGAACCA 2760 

TCrtX^ACrC TQCACACATT AOACACAGTC TATGAAAAGC CCACOCATGA AGAGACGGCSV 2820 

ACAaAOGGTT GGTCTGCAGC AGAlXSTTGOA TCXSTCACCAG AC5CCCACATC CAGTGAGTAT 2880 

GAGCCXCCAT TGGATGCTGT CTCCTTGGCI GAGTCTGAGC CCATGC3y«3i CITTGACX3CA 294Q 

GATTTGGAGA CTAAGICAC3k ACCASATOAQ OATAAGATGA AAGAAGACAC CTTTGCACAC 3000 

CTTACCCCAA CCCCCACCAT CTGGSTTAAT GACTCCSUXTA CATCACAGTT ATTTGA06AT 3060
TCTACTATAG GQGAACCACSG TGTCCXHiGGC CAATCACATC TACAAGGACT GACAGACAAC 3120

RTCCAOCTTG TGAAAAGTAQ TCTAAGCACT CAAOACACCT TACTGATTAA AAAQGGTATG 318 D 

AAAGAGATOT CTCAGACACT ACAGGGAitSGA AATATOCTAQ AGGG2U3AIOCC CACACACTCC 3240 

AGAAGTTCTG AGftQTQAGGG CCAACAOAOC AAATCCavTCA CTTTGCCTGA CTCCACACTG 3300 

OGTATAATGA GCAGTATGTC TCCAGTTAAG AAGCCTGCGG AftA<3CRCAGT TGGTACCCTC 3360

CTAGACAAAG ACAGCAC3Ui£ AGTAACAACA ACACCAAdSGC AAAAAOTTGC TCCGTCATOC 3420 

ACCATGASCA CTCAOCCTTC TCGAA6GAGA CCCAAOGOQA 6AAQGAGATT ACGOCCCAAC 3480 

AAATTCCXSCC ACCGOCACAA GCAAAC30CCA CCCACAACTT TTGCCCCATC AOAGACTTTT 3540 

TCTACTCAAC CAACTCAAGC ACCTGaCATT AAGAxrTCAA GTCAAGTGGA GftGTTCTTCTa 3fi00 

{^CCTACAG CTTtSGQTGGA TAACAC7VGTT AATACCCCGA AACAGTT06A AATGGAGAAG 3660

AATGC7VGAAC CCACATCCAA GGQAACACCA CtSGAGAAAAC AOGGOAAGAa (3GC3UWCA&A. 3720 

CATCGATATA CCCCTTCTAC AGTGnSCTCft flGASC33TCCG GATGCAAGOC CAGOOCTTCT 37B0 

CCAGAAAATA AACATAGAAA CATTOTTACT CCCR(3TTCft<3 AAACTATACT TTTGCCTAGA. 3840 

ACTGITXCTC TQAAAACTGA GGGCCCTTAT GATTCCTTAQ ATTACAItiAC AACCACC3VGA 3900 

AAAATATATT CATCTTACCC TAAAQTCCAA GA6ACACTTC CA6TCACATA TAAACCCaCA 3960

TCAGATGGAA AAGAAATTAA GGATGATGTT GCCACAAATG TTGACAAACA TAAAAGTGAC 4020 

ATTTTAGTCA CTGGTGAATC AATTACTAAT GCCATACCAA CTTCTCQCrC CTTGGTCTCC 40 BO 

ACTATGGGAG AATTTAAGGA AGAATCCTCT CCTOTAOGCT TTCCAGGAAC TCCAACCTGG 4140 

AATCCCTCAA GGAC3GGCCCA GCCTOGGAGG CTACA6ACAG ACATACCK3T TACCACTTCT 4200 

2Xj QGGOAAAATC TTACABACCC TCCGCTTCTT AAAGAGCTTQ AGOA^GrGGA TTTCACXTGC 4260 

GA6TTTTT6T OCTCTTTGAC AGTCTCCACA CCATTTCACC AGGAAGAAGC TGOTTCTTCC 432D 

ACAACTCTCT CAAGCATAAA AGTGGJUSGTG GCTTCAAGTC AQ9C3USAAAC CACCAOCCTT 4380 

GATCAAGATC ATCTTQAAAC CACTaTGGCT ATTCTOCTTT CTGAAACTAG ACCACAQAAT 4440 

C31CACX3CCTA CTGCrSCCGS GATGAAiGGAa CCAGCATDCT CGTCCCKATC CACAATrGTC 450O 

ATGTCTTTGS fSACAAACCAC €^C!ACrAAG CCAGCACTTC CCAOTCCAAfi AATA^CTCAA 4560

GCATCTAGAS ATTCCAAGGA AAATGTTTIC TTGAl^TTATG TGGGGAATCC AQAAACAOAA 4620 

GCAAOCCCAQ TCAAiCAATGA 2\GGAACACAG CATATGTCAQ OGGCAAATGA ATT&TCAACA 4680 

CCCTCTTOOG ACCX3GGATGC ATTTAACTTG TCTACAAiySC TQtSAATTOGA AAAGCAAGTA 4740 

TTTOQTAGTA GGAGTCTACC AOSTGGOCCA QATAI30CAAC GCCAG6ATGG AAGAiSTTCAT 4800 

GCTTCTCATC AACTAACCAO AQTCCCTGCC AAACCCATCC TACCAAC2U9C AACAQXGAlSa 4860

CTftOCXQAAA TGTCCACACA AAG06CTTOC AGATACITXG TAACTTOOCA GTCAOCT05T 4920 

CACTGGAOCft ACAAACOQGA AATAACTACA TATGCTTCrG GGOCITTGCC AGAGAACAAA 4d80 

CA0TTTACAA CTGC!AAGATT ATCAAGTACA ACAATTCCTC TCCCAa::CG<3V CATGTGC3VAA 5040 

OCCAGCATTC CTAGTAAGTT TACTOACXTGA AGAACTGRCC AATTCAATOQ TTACTC5C3iAA 5X00 

GTQTTTGOAA ATAACAACAT OOCTGAGGCA ACAAA£X:CiU3 TTOGAAAGCC TGOCAGTCCA 5160

AGAATTCCTC ATTATTOCAA TGGAAOACTC CCTTTCTTTA GCAACAAQAC TCITTCTTTT 5220 

CCACAOrTQG GAGTCAOOOG GA6ACCCCA0 ATACCCACTT CZOCTGOCCC AGTAATGAGA 5280 

GAGA6AAAAG TTATTC3CAGO TTCCTACAAC AGGATACATT CCCATAGCAC CTTC(;!ATC7G 5340 

QACTTTGQCC CTCCQGCACC TCC3GTTGTTG CACACTCCQC AflACCAOSGG ATCACCCTCA 5400 

ACTAACTTAC AGAATATCCC TATOQTCTCT rOCACOCAGA GITCTATCTC CTTTAIftACA 5460

TCTTCTGTCC AGTCClCRGG AAGCTTCCAC CAQAQCS^GCX CAAAjSTTCTT TGC7W3GAGGA 5520 

CCTOCIGCAT CCAAATTCTO OTCTCTTGGG GAAAAlSCCCC AAATCCTOVC CAAGTCCCCA 5580 

CAGACTtniGT CCGTCACOGC TQAGACAGAC ACTGTITrTCXZ CCTGXGAGGC AACAGGAAAA 5640 

. OCAAAGCCTT TOOTTACTTG CACAAAOQTT TCCACAGGftG CTCTTATGAC 1XXC3AATACC 5700 

AGOATWCAAC GGTTTGAOaT TCTCAAGAAC GGTACCTTAO TQATACGOAA OGTTCAAGTA 5760

CAAGATOGAG GOCAGTATAT GTGCACOGCX: AOCAACCTGC ACGGGCTGGA CRGGATGOTO 5B20 

OTCrraCTTT CXSGTCACQQT GCAGCAACCT CAAATCCTAG CCICCCACXA CCAGGAOQITC 5880 

ACTGTCTACC TGGGAGACAC CATTGCAATG GAjSlGlCIGG CCAAAOS6AC CXrCAGCCCCC 5940 

CAAATTTCCr GGATCTTCCC TGACA6GA05 GTGTGGCAAA CTOTOTCCCC OGIGGAGAGC 6000 

CX3CATCACCC TGCAC6AAAA OOGQACOCTT TCCSiTCAAOG AGGGGTCCTT CPCiS^jQACAaA 6060

GGCGTCTATA AGIGOGTGGC CAGCAATGCA GOCGGQQCaG ACIUGCCIGGC CATCGGOCIG 6120 

CAC0XGC3OGa CACTGC0C30C CGTTATCCAC CAGGAGAAGC TGGAGAACAT CTCOCIGacC! 6180 

CCGGGGCTCA GCATTCftCAT TCACTGCSUirT OCSCRAOaCTO 06CCCCTGCC CaSOGTGOeC 6240 

TGG37QCTCG OQGAOSGTAC CCRfiATCX3GC CCCTCGCAGT TCCTCC3tf3B3 QAACTTQTTT 6300 

DD GTTTTGCCCA AGGGGTiOGCT CTACATOOQC AACCTOGCGC CCAAGGACAG CGGGOGCTAT 6360 

<3AGTQCX3TGG C(3GCCAACCr GGTAGGCTOC GGGOGCAGGA CB^TGCMSCS 6AAOGTGC!AS 6420 

GGXGGftGCAiS CCAAOSOGOG CATCAOGGGC ACCICCCCGC GQlVQGiRCGGA GGTCftQQTAC 6480 

GQAGGAACCC TCAAGCTGGA CTGCAQOSOC TOGGGGGAOC OCTGOCCGOQ CATCClCTTGG 6540 

A8GCXGCCXST CXAAGASGAT GATCGACX30(» CTCTTCAGrr TTGATAGCAG AATCAAGGTQ 6600 

OU TTTOCCAATQ GOAOCCTGGT GG7GAAATCA CTGAOGQACA AAGATGCCBG AGATTACdG 6660 

TG03TAGCTC GAAATAAGGT TaOTGATGAC TACGTGGTGC TGAAAGTGOA TGTOQTQATO 6720 

AAACCGGCCA AGATTGAACA CAAOGAGGAG AACQACCACA AAGXCXTCTA OGGGGGTGAC 6780 

CTGAAAOTQG ACtOIGTOGC CAa3GG6CTT CCCAATCCOG AGATCTOCTQ OASCCTCCCA 6840 

^ GACX3GGAQTC TGSTGMCTC CTTCATGCAQ TOGGATGAXIA GCGGTGGAOG CACCAAGOGC 6900 

OJ TATGTCGTCr TCAACAATGG GAfZACTCTAC TTTAAOGAAG TGGGGATGhG GGAGGAAGGA 6960 

GACXACACCT GCrSTGCTGA AAATCABQTC GGGAAGOAGG AGATGAGAGT CAGAGTCAAO 7020 

GTGGTGACAG C3GCC0GCCAC CATCCGGAAC AAGACTTACT TGGCGGTTCA G(3TGCCCCAX 7DB0 

OGAGAOGTGG TCACTG7AGC CTGTQAOGOC AAAGGAGAAC GCATGCCX:AA GGTGACTTGG 7140 

TTCTCX3CCftA OCAACAAGGT GATCCOCAOC TCCTCTGAGA ACTTATOUQAT ATACCAAGAT 7200 

GGCACTCTCC TTATTC3^GAA AGCCCAGGOT TCTGACAGCG GCAACTACAC CTGCCTGGTC 7260

AGGAACAGGG GGGGAGAGOA TAGGAAGACG GT6TGGATTC ACGTCAAOQT CCAGCCACCC 7320 

AAiGATCAAGQ OTAACXXXZAA CCGCATCACC ACCX3TGCQQG AGRTAGCAfiC CXSGGGQCRGT 7380 

GGGAAACTGA TIGACIGCAA AGCTQAiVaGC ATOCCGACOC OGAGGGTGTT ATGQGCTTTT 7440 

CX:GGW3aC3IG TOGTTCTGCC AGCTCCATAC TATSQAAAJCC: GGATCACTGT CCMGGCAAC 7500 

GGTICCX:TG6 ACATCAGGAS TTTGAGQAAG AGGGACICOG TCCAGCTGGT ATGCATGGCA 7560

OGCaiACXSAGG GRGGGGAGGC GAGGTTGATC GTGCAGCTCtA CPGTC5CIGGA GCOCATGGAG 7620 

AAAOOCATCT T0CAi3GACCC GATCAGCQAG AAGATCAOGG CGATGGCQGQ CCACACCATC 7660 

AGCCrCAACT GCTCTGCOGC GGGGACCC03 ACACCCafiOC TGGT6TGG6T CCTTCCCAAT 7740 

GGCACXZGATC TGCAGnOTQa ACAGC3^CTG CAGCXSCTTCT ACCACA&GGC TGAC3GGCATG 7800 

OU CTACACATXA QCXJGTCTCTC CTCGGTGGAC GCTGQGGCCT ACOGCTGCOT GGCCOGCAAT 7860 

GCCGCTGGCC ACftOOSAGAG GCTGGTCTCC CTGAAGGTGG GACTGAAGCC AGAAGCAAAC 7920 

AAGCASTATC ATATkCCTGGT CA33CATCAT£2 AATGCTGAGA OCCTGAAOCT CTOCTGCACC 7980 

CCTCCCGGGG ClGGGCAGGG AGGTTTCTCC TGGAGGCTCC OCUOGGCAT GCATCTGGA6 8040 

GQCCCGCAAA OCCTOOGACS OGTTTClCTT CrOOACAATG GCSUZCCTCAC GGTTCGTGAQ 8100
GCCTCGGTBT TTGACAGGGG lACClAXGlA TOCSUaOATGO AGACXSOAGTA CGGCCCITCG 8160

GTCAOCAGCA TCCCCOTGAT TQTGATCGCC TATCCTCCCC GGATCACCAa CGAGCCCACC 8220 

CCGGTCATCrr ACACCCGGCC GGGGAAC^CC GTGAAACTQA ACTGCATGGC TATGGGGATT 82&0 

CCCRAAGCTG ACATCSWZOTg GOAGTTACCC GATAAGTCGC ATClGAflflGC AGGGGTTCAQ 8340 

QCTCGrClGT ATGGAAACA6 MTTCTTCaC CCCCACSGOAT CACTOACCAT CCAGCATGCC 8400 

AOUAGASAfi ATGCCGt3CTT CTACAAGTGC ATGGCAAAAA ACATTCTCCSG CAGTGACTCx: 8460 

AAAACAACTT AChTCCACGT CTFCTGAAAT GTGGATTCCA GAATGATTGC TTAGGAACTG 8S20 

ACAACAAAQC GG03TTTGTA AGGGAAGCCA GGTTGGGGAA TAGGftGCTCT TAAATAATOT 8580 

GTCACRGTGC AXGGTGGCCT CTQGTQGGTT TCAA(3TTGAG GTTQATCTTG ATCTACAATT 8640 

GTTGQGftAAA GGAAGCAAT6 CAGACAOGftG AAGGAGQGCT CA<KX;TTGCT GAQACACTTT 8700 

CrTTTG TgTT XACATCATOC CAGGGK3CTTC ATTC3U3G6IG TCTGTGCTCT GACIGCAATT 8760 

TTTCTTCTTT TGCAAATGCC ACTGSACtGC CTTCATAAGC OTCCATAGGA TATCTGAGGA 8820 

ACATTCATCA AAftATAAGCC ATAGACATGA ACAACACCTC ACTACCCCAT TGAAGACGCA 8880 

TCACCTAGTT AACCTGCTGC AGTTTTTACA TQATAOACTT TGTTCCftfiAT TOACSUVSTCA B940 

TCTTTCAGTT ATrTCCTCTG TCACTTCAAA ACTOCAGCTT GCCCAATAAG GATETAGAAC 9000 

CAGftGIGACi: GATATATATA TATATATTTT AATKZAGAQT TACATACATA CAQCTACXaT 9060 

TTTATATGRA AAAAGAAAAA CATTTCTTCC TGGAACTCAC TTnTATATA ATCTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCaCAOGAT GAGACTAGAA GGAGAAATAC TTTCEQTCTT 9180 

ATTAAAATTA ATAAATTATX QQTCTTTACA AGACTTGGAT ACATERCAQC AGACATG^^ 9240 

ATATAATTTT AAAAAATTTC TCTOCAROCT CCTTCAAATT CAGTCACCAC TOTTATATTA 9300 

CCTTCTCCRG GAACCCTCCA GTGGGGAAG6 CTGGGAXAXT AQATTTCCTT GTATGCSUU^ 9360 

TTTTltsIlGA AAGCTC3TGCT CA6AGGAGGT GAaAGGA<3AG GAA06AGAAA ACTGCATCAT 9420 

AACTTTACAG AATTGAATCT AQAGTCTTCC CCGAAAAGCC CAGAAACTTC TCTGCftGTAT 9480 

CT6GCTTGIC CMCTOGTCT AAGGTGGCTG CXXCXTCCCC AGCCATGAGT CAfiTTTGTGC 9540 

CCATGAATAA TACACGACCT {STTATTTCCA TGA CT G C TTT ACXOTATTTT TAAGGTCAKI 9600 

ATACTGrACA TTTGATAATA AAATAATATT CTCCCAAAAA AAAAA 9645 

Seq ID NO: C12 DNA Sequence
Nucleic Acid Accession #: AK001903
Coding sequence: none

1 11 21 31 41 51

I I I I I I 

3 J TATCAOXaCAT GTOOCSAAOGT GG6T61GGTG AGAAAAOTTT TAAGGCAAGA GlAGAraGCC €0 

ATQITCftACr IIAC3UUiATr TCTTOGAAAA CTQGCAGTAT tCTTGAACTQC ATCTTCTTTG 120 

GTACG6G21AC CTQGAQAAAC AGTGTGAdSAA ATTAAGTCCT GGTTCACTGC GCAGTAGCAA 180 

AGATOGTCRA GGCCATGGAA AAAGCAGAAA TTTAOCAAGA AAGCTOATAC CCATGTATAG 240 

TTOCCACTCA TCTCAAATAC ATCTGCTATC TTTTTAAQCT AAGTCCTAGA CRTATCQGGa 300 

4U ATAACATGGG G GTTG ATTAS TGACC^CAGT TATCAGAAGC AQAOAAATGr AATTCCATAT 360 

TTTATnGAA A CTTATTCCA TATTTTAA'XX GORTATTGAQ TGATTGGGTT ATCakAACACX? 420 

Cfl^ftAftCTTT AATmGTTA AATTTATATG GCITTGAAAT AGAAGTATAA GTTGCTACCA 480 

TTTTTTGATA ACATTGAAAG ATAGrATTlT ACCATCTTTTA ATCATCtTQG AAAATACAAG 540 

TrCCTGTOAAC AACCACTCrr TCACCTAGCA GCATGAGGCC AAAAOTAAAG GCTTTAAATT 600 

4 J ATftACA TATQ OOftXTCTTAG TAGTATQTTT TTTTCTTGAA ACTCAfiTGQC TCTATCTAAC 660 

CITACT ATCr CCXCACTCTT TCTCTAAGAC TAAACICIAG GCTCXTAAAA ATCTGCOCAC 720 

ACCAAXCTTA GAAGCTCTGA AAAXSAATTTS TCTTTAftRTA TGTTTTAATA GTAACATGTA 780 

rrTTTATGGAC CAAATTGACA TTTTCGACTA TTTTTTCCAA AAAAGTCftGG TGAATTTCAfi 840 

CACaCTGAST TOOOAATTrC TTATCCCAOA AGACCAACCA ATTTCATCATT TATTTAAGAT 900 

DU TQATTC2CATA CTOOGTTTTC AAGGAOAATC CCTGCftSTCT CCTTAAA6GT AGAACAAATA 960 

CTTTGTATTT TTTTTTrCAC CATXGXGGGA TIQSACTTTA AGAG6TGACT CTAAAAAAAC 1020 

AtSAGAACAAA TATGTCTCna TTBTATTAAG CAOGGACCXA. tATTATCATA TTCACTTAAA 1080 

AAAATGAITT CCTGTGGACX: TTTTGGCAAC TTCTCTTTTC AATGTAG6GA AAAACIXAOT 1140 

CAOOCTQAAA ACOCACAAAA TAAATAAAAC TTGTAGATGT GGGCAGAAGQ TTTGGGGGTG 1200 

GACATTSIAT GTGTTTAAAT TAAACGCTGT ATCaunHMA AGCIGTTGTA TOGGrCAGAG 1260 

AAAAXGAATQ Cr3»0AAaCr GTlCACKCCX TCAAGAGCAG AAGCAAACCA CATOTCTCAG 1320 

CTATATTKTT AtrrATTTTT TATG CATA RA GlfifiATCATT TCTTCTGTAT l^VATTTCKIRA 1380 

AGGGTTTTAC CCTCTATTTA AATGCTTTGA AAAACAGTGC ATXGACAA3X3 GGTTGATATT 1440 

TTTCTTTAAA AGAAAAATAT AATTATGAAA GOCAAfiATAA TCTQAAGCCT GTTirATTTT 1500 

OU AAAACTT TTT ATmTCI'GT G GTTGA TGTTG TTTQTTTGTT TGITTCIAXT ■ nWiW f rr 1560 

7XXACTTTGT TTTTT6TTTT (SCTTTGeTTTF GTTTTGCftXA CTAOViaCM TTCTTXaAOC 1620 

AATGTCTGTT ^?GGCXAATGT AATTAAAGTT GTTAATTTAT ATGAGTGCAT TTO^ACXATB 1680 

TCRA!CGGTTT CTTAATATTT ATTGTGIAQA AGTACTGGTA ATrTTTTTAT TTACAATA3X3 1740 

^ TTTAAAGAGA TAACAGTTTG ATAT13TTTTC A1GTQTTTAT AGCAGAAGTT ATTTATTTCT ISOQ 

Oj ATGGCATTCC AGCGOATATT TTGGTGTTIQ OGAGGCATGC AGTCAAXATT TTGTACRGTT 1860 
AQTaaACROT ATTCAGC3VAC OCCTQATAGC TXCITIGSCX: TTATGTTAAA TAAAAAGACC
T6TTTGGGAT &S 1932

Seq ID NO: C13 Protein Sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..5001



1 11 21 31 41 51

ATGCCAGGCA CAAAACTAAC GGGAACZ^C GGCCCASCnG ACTACAGAGX GhTATTOAAG 60 

AGCTCTCAAG AGGAOGAAXT OQATQTAOCT GAOGACATC3^ GGGTOOGGGT TATGTCATCT 120 

CAGTCTGTGC TTGTGTCCTG OGTGGATCCT GTTCTGGAAA AACAGAAGAA ASrTGTTOCA 180 

TCAAGACflOT ACACCGTGCG CTATCXSAGAG AAGQGGGRAT TGGCCAGGTB OGATTATAAS 240 

oU CAOATOGCTA ACAGGOGTGT GCTGATTGAG AACCTQATTC CAGACACIGT GTATGAATTT 300 

GCAOlCGeZA TTZCACAGGG TGAAAGACILT GGCAAATGSA GTAGGtCAGT CTTOSAAGA 360 

ACAGCAGftAX CTGQCOCTAC CACAGCTCCT GAAAACITQA AOSTCTOGCC AGTCSUaGQC 420 

AAACCXACZAQ TTOTOGCTGC ATC TTGG GAT QOGCTAOCAG AGACTGAQG6 GAAAQTGAAA 48D 
GTCTGTCTGC TOSACACAGG ACTGrTTTCA GTTTCCTCCT TGCAAOCATC TGCCAAATCA
TTTCAGAATA CATTCITTCA TACGCCCCGG CTCTCAAACC ATTTGGAaSCft AAGTCCCTCa^ 600

CCTATCCTQQ AaaCACTACT TCTGCCCTOO TGGATGGTCT GCAGOCTGG6 QAACGCTATC 6€D 

TTTTCaAAAT CCGOGCCA.GA AACAGGTUJAG GCCTGGGPiCC TCaCTCCAAA GCCTTC3OTG 720 

_ TCGCTATGCJC AW^AAGAATG CAGCTGTACC CAGAAGGATT TC3«3TTGTCT AGCTTACCTG 760 

0 ATOGATATCC AAACCAAACS^ JUsTTJATAAA GATCCACAAC TGGAAGQOAG TOTTTTTGQA 840 

CCRTGTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA TTGGGGSCTT TTCCTTCATT 900 

ATOTGCTATG AAGACCCMilBr TGTTTCTTCr TT6ACRGGCA ATTCTTTAAA ATCTGTTOCA 960 

GCCAGTAAQG CGGATGTTCA GCAGAACAOB GAGGACAATG GQAAACCOSA AAAACCTGAG 1020 

^ CCTTCCTCAC trTTCTCCCRQ AGCTCCAGCT TCCTCOCAAC ACCCCTGTGT GCCTGCTTCT 10 90 

Ivl CCCCRAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTQA A6AACAAAAT ATIOGCTAAT 1140 

GGTGGOGCGC CCCBRAAACC CCASCTTGGC GGGA&GAA06 CAGAGGIVGCr GGATCTTCftG 1200 

TCQRCRGAAA TCACtGGGSA GGMIGAGCTG GOTTCCOGGG AQGACTOGCC CATGTCACCX: 1260 

TCAGACivax: AAGACCAGAA AOOGAOCXnTG AGGCCX3CCAA GTAGACACGQ CCACTCGGTG 1320 

^ GTTGCTCCCG GCAQGACTGC AGTGASQGCC CGGRTGCCAG CGCTGOCCCG AAGGGARGQC 13 HO 

I J GTA6AT2iAGC CTQGCTTTTC CCTGGOCAOG CftfiOCCC3GCC CAGGGQCQCC COCCTCGGCT 1440 

TOSSCCTCTC CTGCCCMCA CGCOTCffiRCC CftGGGCACCT CTCATCGTGC TTCCCTGCCT 1500 

GOCAGCTTGA ATGACAAC^ CTTGGTGGAC TCAGftCGAAG ATOAQaaCQC TGTGG(3CTCC 1560 

CTCCRC3CCCA AGGGCXSCCTT CGCCCAGGCC 0(3eCX»GD0C TGTCCCOCAG COGCCAGTCC 1620 

CCGTCCflGCG TTCTCOGOQA CRGAflGCTCT GTGCACCCCG GCJGCAAAGCC AGCX:TOGCa3 1680 

QDGCGGftGGA OCXXXICATTC AOGGGCCJGCA GAGGAAGATT OCftCTGCCTC AGCCX:CACCC 1740 

TCAMGACTTT CTCXACOCCA T6GGGGATGA rCClGGOCtGC TGCCCACOCA GCCACACdG 1800 

AGCTCTOC^ TSTCCMUGGG GGGQ1UU3GAT GGTGAGGAG6 COCCAGCCAC CAACTCCAAT 1860 

GOGOCATCAC GGTCCACGRT GTCCTOCTOC GTCICTTCTC ATCTCTOBTC CAGGACGCAG 1920 

QTCTCTGAGG GAGCGGAQGC TTCTGATOGT GRAAGC3CACG CTGACQQC5GA TAGGGAASAC 1580 

GGOGGAAGGC AGGCSaaROQC CaCGGGCCAG AOGCTGCGQG CCCGGCXTTGC CTCTQQACAC 2040 

TTCCATTTGC TCaG3K»CAA AC3CCm?GCT GCCUUSGOQA GQTCTCCAftG CAGGTTCAGC 2100 

ATTGGGOSGG GfiCCtCBOCnr GCUaCCCXCC AGCTCCCCAC ASTCGACTGT GCX^CTCOGSA 2160 

GCCCACCCCA GaOTTOCCTC TCACTCTGAT TCCCACCOTA AGCTTAOCTC AGGTATCCAT 2220 

GQJMiAaSRGG AGGATGAGAA GCCKCTTCCT GCCACOGTTG TCAATGACCA CGTGCCTTCC 22 BO 

OX) TOCTCCAGGC AGCCCATCTC COGGGGCIGG GAGGACTTAA GGAGAAOCCC GCAGAGAGOG 2340 

GCXSkSGCIGC ATGGGAAGGA ACCCftTCCCA GAQAACCOCA AATCCACnSG GGChOAlACA 2400 

CATCCTCAG6 GCIffiGXACTC CTCXXTTCGOC TCCAAGGCTC AGGATGTTCA ACAGAGCI^CA 2460 

GACGCGGACA OGGAGGGTCA TTCTGCCAAA GCACAGCCAG GGTCCACAGA COGCCACGCX3 2520 

TCOOCTGCTC OTCXTTCCXXJC AGCROGGTCA CAGCftGCATC CCMTGTTCC CAGAAGGATG 2580 

ACAGCXX3GOC GG6COCCAGA AC2)GCAQCCC CCTCCTCCCK TOGCCAG6TC CCAGCACCAC 2640 

COGGGAGGOC AE3IU3CAQA0A OS0GG6TGG6 TCACClrTCGC AGCXSCAjGGCT CTCACTGftCC 2700 

CBQBacaaoc ggccccgccsc csucgtogcag aacoacrccx; actgctoctc ggjuocctterc 2750 

AOSGOGAGCrr CCRQAQGGAT GCTCCCC3U3G GCCCTCCAGA ACJCROGACQA GGATGCCCAG 2820 

. OQCaGCTAOG AGGACGACRG CACAGAAGTC GAGQCCCAGG ATGTGOGGGC CCXICXSOGCRC 2 0ao 

4\J GOCGCGOGCG CGAAOGAGaC AGCTGGGTCC CrTOCCRftaC ACCAQCAGGT GGAGTCTCCC 2940 

ACAQQGGCTG GGGGM3GXGG OGSnCOkCAGQ TOCCAGOGGG GACATGGGGC CTCCOCSCQOC 3000 

AGGCCCAGCC GAOCOKaCGO CCCXX»GTCC OSCQCSCaSGG TCCCCAGCAQ GGC36GGC0G 3060 

GGGAAGTOGG AQOCTCCTTC CAAGCGOGCX: CTGTOCTCCA A6TCCCAGCA GTCGGTCTCA 3120 

. GCOGAGGAOG ASOAOGAGGA GGAOGOQGGG TTTrrTAAAO GGGGGAAA0A AGACCTTCTG 3180 

4 J TCTTOCTCTG T60CAAAGTQ GOCCTCTTOC TCCACTCCCA GGGGGGGCftA ASACaoCOAT 3240 

GGGAGCCTOS CCaAGGMGA GAGSQAGCCT I3CCATGQCGC TTGCCCCTOS OOGASGGAGC 3300 

CTGGCTCCTG TGMVSCXSRCC TCTCCTGCCCA OCTCCAGGCA GCTCCCCCAS OGCCTCCCIkC 3360 

GTCCCTTCCC QACOGOOGCC TCGCAGCGCT GCCACCGTGA GCCSOOGTCaC GGGCACCCAC 3420 

Ct3CTGGCCGC GGTACACCftC QCQCGCCCCV CCTGGCXACT TCTCCACC&C CXJCXSATGCTO 3480 

JV TOCTTGOGCC AGAflOATGAT GCATGCCAGA TTCCGTAACC CTCTCTOC3CO ACAGOCTGOC 3540 

AGACGCTCTT ACAlGaUAAGS 7XAT3UVIGGC AOMXSUUKFG TAGMGGGAA AGXCCITCCT 3600 

GGTAGTAATG GAAAACOOllA TGGACAiQftGA ATIATCAAIG aCCCTCMOa AACAAAGTG6 3660 

GTTG^EOQACX: TTCATCXSrGG CrTAGTATTG AATGCAGAAG GAAGGTAOCT OCAAGAXTCA 3720 

^_ CRTGGAftATC CTCmXSGAT TAAACX7VGGA GGAGATGGTC GAAJ3GATTQT AQATCTGGAA 3780 

J J GGGAOCGOCO TGGtTQAGTOC TGACGGCCTC CCACTCTTTG GGCAGGGGOG ACMGGCAGA 3040 

CCTCTGGCCA ATGCCCAAiSa TAnacXMIT TTGAGTCTTG GAfiGAAllGGC GCTOaiaGGC 3900 

TTQGAGOTCA TCAAAAAAAC CRCCCATCC3C CCTfcOCACTA OC3KPGCAG0C CAOCaCChCT 39<0 

AoaacGcccc TGCcxAccau; tacaacccoj aggccx^cca crGcxaccAC catgcagckc 4020 

ACCZ^ACTA <3aACaCCCCT GCCTACCACT ACACCGfiGGC CCACCACTGC CAOCACOCGC 40 BO 

OU CQCACQACCA OCAGG06TCG AACAACCSVCA GTCC33AACCA CTACGGGGAC AIVCX2ACCACC 4140 

ACX3WOTCCA AACCCRCCRC TCOCaOJOCSCC AOCTGTOOCC CTGGOKSerT QOAAGGSCAC 4200 

G2lCX]^rFGATO GCAACCTGAT AATGft^SCTCC AATGGGATCC CAG7VGIGCTA GGCrOftAGAA 4260 

GATGRCTTCT CAGGCTTGGA GACTGACACT GCAGTAXXTCA COGftASftGGC CTAaBTTATA 4320 

TATGA1GAAG ATTATGAATT TGASACGTCA AOOCCACCAA CCADCACTGA GOCTTOGAOC 4380 

OJ ACTGCTAOCA CACOGfiGGBT GATCCCAGAG GAAQGOGCCA TCa^fiTTCCrT TGCTOBAGAA 4440 

GAATTXGATC TGGCTGGAAG GftAftOGATTr tfA ' itauT C LVl ' ACGTGACGTA CCIAAATAAA 4500 

GACCJCATCAG CCOOGTGCTC TCTOACTGAT GCACTCGATC AePTCCftftST eOftCMCXTO 4560 

6ATGAAATCA TCCCCAATGA OCTGAAGAAG ABTGATCTGC CTCCCCftGCA TGCTCOOOGC 4620 

TOAACATCAC06 TGG1GGCCX3T C3GAAQSTTGC CACTCATTTG TGRTITGTGGA TTGGGACAAR 4680 

GCCSLOOCCAG GA8ATTTGGT CACAjGGTTAT TTGGTTTACA GTGCATCCTA a:GAAGATriC 4740 

ATCftGGAACA AGTTTTCCRC TCMGCTXC3^ TCSUBTAACZC ACTTGCOCAT TOAQAACCIA 4800 

AAGCCCAACA CGftGGTATiA TTTTAAAGTG CAAGCACRAA ATOCICRTGS CTAOSGRCCr 4860 

ATCfiGCCCiT OGOTCTCATT TGTCACCGAA TCAGATAATC CTCTGCTTQT TGTGAGGOOC 4920 

— _ OCAQQCWCTG AGCTATCTGG ATCCCATTCG CTTTCAAACA TGATCCCAGC TACACGGRCT 4980 

GCCATGGACG GCRATATGTG AftSOGCAOST OGTATCORAA GTHSGTaGOA GTTOTTCTTT 5040 

GTAATTCACr GAGGTATAAA ATCTAOCTCA GTGAC»AGCT GAAAGATACA TXCXACAGCA 5100 

TTOGAGACAG CTGGGOAAGA GGTGAAfiZ\£C ATTGCXAATT TGlCGATTCA CACCTTGAT6 5160 

GAAGAACAGG GCCTCAGTCX; TAiaTAGAAG OCCTCIOCTAC TATTGAAOGC rACTATCGCC 5220 

AGTATOSTCA GGAGOCTOTC AGGTTTGOGA ACAlOGGCrr CGGRA£X:CCXZ TACTACTATC 5280 

OU TGGQCTGCTA GGAGTOTGGG GTCTCCATCC eEGOAAAOTG GTAATCACAG GAOCXaXCATG 5340 

CTGCAAGCTT GCCCTGCCCA 6CCXX3VCCRA CTAftGTOGCA CTAQOGGCre TGAOCAAAtA 5400 

CAGCCAGCSVr GCICAGCCCC GCTGCCCTAO GTG0C2AGGAA GGTCACAGAT GGACACIGGC 5460 

CATTCTGGTC ATCTCAGTCX QGAACTCAGT COCACTTCTT GGOCTGGACA AXGAACAGQA 5520 

TICRQTTTTG CTGTTAACXT TGCTTCTCIA CITTTTTTTG nrXGTTTGTA ATAGCACATC 5580
CCAGAGACAT CAGAAACCAQ CAACTGATTC AQT3TGATTT CCCAGRCITT TTAGGCRTGA 5640

AATTOQGACA CTTCAGTATT TCCAGGAMA GCATATQCAC GCTGTrCTTO CTTCATGQ2VA 5700 

TGCTACAItJC TTTCTGTTTT TCTCATTTTG GATTTCTCCA AAACTAACXe AATTTAAGCT 5760 

TCftGGTCCCT TTGTATGGAG TAGAAAGfiRA TTATTAAAAA CAOCAOGftAA GAAAATAAAT S820 

D ATATCCTACT axSAAATTTAC TCTATGGRCT TACCXaCTGC TA6AATARAT GTATC3UVATC 5880 

TTATTIGTAA ATTCTCAATT TTGATRTATA TAOXJEATATA TGCATATACA TATCCftCACT S940 

TGTCTGCyVAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCACCaA AAAAAAAAAA 6000 

AAAAAAA 3007 

Seq ID NO: C14 DNA Sequence

Nucleic Acid Accession #: NM_003014
Coding sequence: 238..1278

1 11 21 31 41 51

QGCGGGTTOG CGCCCCOAAG GCTGAGfAtSCT G<3CX3CTGCTC GTGCCCTOTa TGCC2W3ACOG 60 

CX3GAGCTCao CGGCCGGACC CCGCX3GCCCC QCTTTGCTac CSACTGGAGT TTOaOQGARG 120 

AAftCTCTCCT 6CGCCGCRGA AGATTTCTTC CTGGGOGARG GGACWSOQAA AGATGAGQGT 180 

GOCAGGAAGA QAAGGGGCTr TCTGTGTOCC GGGGTCGCAG GQOOAGAGGG CfttSTQCXATG 240 

TTCCTCTCCA TCCTBGTGGC GCTGTGCCTQ TGGCTQCRCC TGeCQCTCSGG CGTCCGCQQC 300 

GOSCCCTQCXJ KXXGGTGCG C3VTCCCTATC TGCCKGCaCA TGCCCTGGAA CATCACGCX» 360 

ATQCXXAACC ACCTGCACCA CAGCACGCAQ GAGAACGCCA TCCTGOCCAT CGAGC31GTAC 42 0 

GAGGAGCTOQ TGGACGTGAA CTGCAGOGCC GTGCTGCaCT TCTTCTTCTG lOCCATGTAC 480 

^ - GCaCX3CATTT GCACCCTOQA GTTCCTQCRC SACCCTATCA ASOOerSCiUL GTOGGTGlGC 540 

CRAa3C3GCGC GGaACCSACTO CXSAQCCXJCTC ATGAAGATCST ACAACCRCftS CrGQCCCQAA 600 

AGCCTOgcCT GCGACGAGCT C3CCTCTCTAT GACCX3T(3(30C3 TGTGCATTTC tacCTGAAGCC 660 

ATC<5TCACXS5 ACCTCCCGGA GGMXSTTAAQ TCGATAGACA TCACACCAGA CATGATCGTA 720 

CAGGAAAOac CTCTTGATGT TOACTOTAftA OSCCTAflflCC COGATOGGTG CftAQTOTAAA 780 

AAGGTGA»3C CAACTTTGGC AAGGXATCTC AOCAAAAACT ACABCZATQT TATTCATGCC 840 

AAAATAAAAQ CTGTGCAGAG GAQTGC3CTGC AATGAQGTCA CMJCGGfSGGT GGATQTAAAA 900 

QAOATCTTCA AGTCCTCATC ACXXaTCCCT OQAACTCAAG TCCO©CTCAT TACAAATXCT 960 

TCTTGCCaaT GTCX».CACAT CCTGCCCCAT CAAGATGTTC TCATCATCTG TTACQAGTGG 1020 

CQTTCRAGGA TGATQCTTCT TGAAAAITGC TTACJTTGAAA AATGGAGJU3A TCflfiCrrAOT 1080 

^_ AAAfiGATCCA TACRGTGGGA AGAQAgGCTC GAGGAACAaC GGAGAACAGX TCAGQACAAG 1140 

AAGAAAACAG CGGGGCaCAC CA6TCCTAGT AATCOCOOCA AACCAAAQGG AAAGCCTCCT 1200 

SCTCCCAAAC CAGOCAGTCC CRAOAAGAAC ATTAAAACTA GGAGTGCOCA GAAOAIJAACA 1260 

AAOCCGAAAA GAfiTGT<3A(3C TAACTAGTTT CCAAAGOGGA GACTTCOTAC TTOCTTACftQ 1320 

GATGAGGCTQ GGCATTCCCT GGQACAGCCT ATGTAAJ3GCC ATGTCCCCCT TGCOCTAACA 1380 

ACTC aCTGCA GTGCTCTTCA TAGACACATC TTOCftGCATT TTTCTTAAeG CTATGCTTCa 1440 

GTTTTTCTTT OTAAGCGAXC ACAAGCCATA GTGGllAGGTT TGCCCTTIGG TACSkSAAGGT ISOO 

GAGTTAAAGC TGGTQQAAAA GGCTTATTGC ATTCCATTCA GAGTAACCTG TGTGCATACT 1560 

CTAGAAGAQT AGCSGAAAAXA ATGCrTGTTA CAAITCGACC TAATATGTGC ATTGTAAAAT 1620 

AAATC3CCATA TTTCAAACAA AACAOGTAAT TTTTTTACAG TATCTTTTAT TAOCTrmaA 1680 

TATCEGTXQT TSCAATGTTA QTSATGTTTT AAAAlGTOAT GAAAAIKTAA TOTTTETAAG 1740 

AAGGAACAGT AGTGGAATGA ArTGTTAAAAa ATCTTTArrOr GTTTATGGTC HGCAGAAGGA 1800 

TTTt TOTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCAlATOO AAAATTATAA 1860 
TaTOTTTTTT TACCAATGAC TOHZAGTTTCT GTTTTTAGCT AOAAACTTAA AAACAAAAAT • 1920 

AATA&aJAAAG AAAAATAAAT AAAAAGGAGA GGCAGACAAT CTCTGGA^TTC CTOTTTTTTG 1580 

GTTACCTGAT TMMGATC ATGATCCTTC TTOTCAACAC OCTCTTAASC AGCftCafiAA 2040 

ACAGTGAGTT TGTCTGTAC3C ATTAG6AGTT AGGTACTAAT TA6TTGGCTA ATOCTCRAGT 210 0 

ATTTTATACC CACRW3A3AG GTATGTCACT CATCTTACTT OCCAGQACAT CCACOCTQAG 2160 

AATAATTTGA CAAGCTTAAA AATOGOCTTC ATGTGAGTQC CAAATTTTGT TTTTCTTCAT 2220 

TTAAATATTT TCXTTOCCXA AATACAT6TQ AGAGGAGTTA AAlATAAAno TACAGAGAGG 2280 

AAAGTTGAGT TC3CACCTCTG AA Aagft OAAT TACTTSAlCAQ TrCQGRTACT •rl3iU«K»0AA 2340 

AAAAAfiAACT TATTMCBSC ATTTTATCAA CAAATTTCAT AATSGrfiQAC AATTSGASeC 2400 

AirEATTTTA AAAAACAATT TTAXTGQCCT TTTGCTAACA CAGTAAGCAT GrATTTTATA 2460 

AGGCATTCAA TAAATQCACA AOGCCCAAAS GAAATAAAAT CCTATCTAAT CCTACICTCX: 2520 

ACTACACAGA GOTAATCACT ATlAlffrATTT TGGCATATTA TS^CCROfST GTTTOCTTAT 2580 

GCACTIATAA AATGATTDSA AC3UIATAAAA Cl»0C3AACCr GTAXACATOT GTTTCATAAC 2 640 

CXGCCTCCTT TGCTTOSOCC TTTArXQAOA TAAGTTTTOC rCarCAAOAAA GC3M3AAACCa 2700 

TCTCATTICT AACaGCTGTG TTATATTCCA TAGTATOCAT TACTCRACAA ACTGTTGXGC 2760 

rATlGOATAC TlAGGTeOTT TCTTCACIGA CaATACIGAA TAAAfSkTCTC ACOGSAATTC 2820

Seq ID NO: C15 DNA Sequence
Nucleic Acid Accession #: NM_005940
Coding sequence: 23..1489



1 11 21 31 41 51 

i ! I 1 I I 

AAOCCCAGCA GGCCXXaOGec QGATGGCTC3C GQOCGCCTOQ CICCGCAGOS OGGCCOaSCO 60 

CGOCCTCCTG CCCCOGATGC TGCTGCTGCT GCTCCftGCCQ OCGOOGCTGC TGQOCDGGGC 120 

TCTGCCGCCG GACGTCCacJC ACCTCCaTGC CGAGAGGAGG GGGCCACAGC OCrGGCATQC 180 

hGCCClGCCC AGTAGCXXXSG CACCTGCCCC TGCC3«X5Cfta GAAGCCGOGC GGCCTCOCAG 240 

CAGCCTCAGG CCTCCCCTCT GTGGOGTGCC OGACOCATCT GAlGGGdOA GTGOCXSGCAA 300 

/J GCQ ACAG AAG AGGTTOGTGC TTTCTGGCGG GGGCTGOQAG AAGAOGGACC TCACCTACA6 36Q 

GATOCTTGGG TTCGCATGGC AGTTGGTGCA GGAOCAGGTG CGGCAflACGA TCGCAGAGGC 420 

CCTAAAGGTA TGCaAGCGATG TGACGCXACT CACXTTTTACT QAQGTGCACG AfiQGOOGTGC 480 

TGACATCATG ATOGACTTO? CCAGGTAClG GCATGOGGAC GACCIQCOGT TTOATGGGCC 540 

„ tTGGGGGCATC CTQQCCXaVTG CCTTCITOCC CAAGACTCAC OBAQAAOGGG ASGTOtaCTT 600 

OV DGACTATGAT GAGACCTGSA CTATOGGGGA TGAOCftOBGC ACAGACCTOC TOCAGGTCGC 660 

AQCCCATGAA TTTG GCCAO G TGCTGaOQCT GCAGCACACA ACAOCIAGOCA AGGCCCTQAT 720 

OTCCQCCTTC JACACCTTtC GCTACCCACI' GAGTCTCAQC CCAGATQACT GCAGGGGCGT 780 

^CAACAOCTA TATGGCCaOC CCTQGCCCAC TGXCACCTCC AGOACCCCAG COClXSGGCXIC 840 

- CC3M3GCTGC3G ATAflACACCA ATOAGArEGC ACCGCTGGAG GCAGAGGCCC CGCXaSATGC 900
CTOTGAGGCC TCCTTTGAC3G OGGTCTCCAC CATCOJJtfSGC GAGCTCTTTT TCTTCAAJKSC 960

GGGCTTTQTG TGGCGCCTCC GTGGGOaCCA GCTGCflGOCC GGCTACCCAG CATTGOCCTC 1020 

TCaCCACTGG CftGGGACTGC CCRGCCCTGT GGACQCTGCC TTCGAGGATG CCCAGQGCCR 1060 

_ CATrTGGTTC TTCCAAGGTa CTCftGTACTG GGTGTACGAC GGTGAARAGC CAGTCCTGGG 1140 

J C3CCCGCRCCC CTCACOOJVGC TOGGCCTGGT GRGGXTCCTO GTCCATGCTG CCTTGGTCTG 12 OO 

GGGTCCCSftG AAGAACAAOA TCXACTTCIT CCGAOGCftGG GACTACTGOC 6TTTCCACCC 1260 

CAOGACCCGG CGTGTAGACA GTCOCGTGCC COGCRGQGCX: ACTGACTGGA GAGGQfflOCC 1320 

CTCTOAQATC GACGCTGCCT TCCAGGATGC TGATOGCTAT BCCTACTTCC TGCGCGGCOG 1380 

1 n CCTCTACTOS AAGTTTGACC CTGTGAAGOT GAAGGCTCTG GAAGGCTTCX: CCCGTCTOGT 14*0 

lU C3G6TCCTGAC TTCTTTGGCT GTGOCX3AGCC TK3CCAACACT TTCCTCTGAC CATGGCTTOS 1500 

ATGCCCrCfiG GGGTGCTG?^ CCCTBO&aO CCAaSftATAT CAGGCTAQftO RCCCATGGCC 1560 

ATCTTTGTGG CTGTGGQCAC CAGSCATGOG ACTOMCCCR TGTCTCCTGC AQGQGSATCSG 1620 

GGTGOGGTAC AACCACCATG AC3UiCTGC0G GGRGGGCXyiC GCAGGTCQTG GTCACCTGCC IfiSO 

^ ^ AGOGACTGTC TCAGftCTBGG CAGGGAGGCT TTaOGRTGAC: TTAAGAGGAA GGGCAGTCTT 1740 

IJ GGGACOOQCT ATGC7W3GTCC TGaCAAACCT GGCT6CCCT6 TCTCATCCCT GTCCCTGRGG 1800 

C3TAGCACCAT GGCflGQACTG GGGGAACTQG AfiTQTOCTTG CTGTATCCCT GXTGTGAQGT I860 

TCCTTCCRGQ GGCTGGCACT GAAGCAAOGG TGCTGC3GGCC CCATGQCCTT CAGCCCTOSC 1320 

TGAQCAACTG GGCIGTAQGO CASGGOCACT TCCrtaROOTC AGGTCTTGGr AGGTOCCTGC 1980 

ATCTGTCIGC CTTCTGGCTG ACAATCCTOG AAATCTGTTC TCCRQaATCC AGGCCAAAAA 204D 

QTTCRCAGTC AAATGGGGAG QGGTATTCTT CAT0CR6GAG ACC0C3VQGCC CrOOAGGCTG 2100 

CAACRTAC!C7r CAATCCTGTC OCAOQCOGGA TCCTCCTOftA GCCCTTTXCG CflGCACTGCT 2160 

ATCCrCCARA GCCATXOTAA ATGTGTGiaC AGTGTOTATA AACCTTCTTC TTCTTTTTTI 2220 

TTTTTAAACr GAGGATTGTC ATTAAACACA GTTGTTTTCT 2260

Seq ID NO: C16 DNA Sequence
Nucleic Acid Accession #: NM_024022
Coding sequence: 202..1563



ACX3GGGCACC GGACXSGCTCQ GGTACTTTCG TTCTTAATTA GGTCATGCCC GTGTeiW3CC3A 60 

GGAAAGGGCT GTQTTTATGG GAAGCCAGTA ACACTGIEGGC CTACTRTCTC TTCCGTOGTC 120 

CCarCTACftT TTTTGQQACT aSGGAATTAT GAGGTAGAGG TGGAGGOGGA GCC3QGATGTC IBO 

AGfiGGTCexo AA ATAG TCAC CATGGFGGGAA AATGATCCGC CrGCTQTTGA ASCCCCCTTC 240 

J J TCATTCCOAT CGCrTTTTGG 0CTTGAT6AT TTGAAAATAA GTCCTCTTGC ACCaiOATGCA 300 

GATGCTC51TQ CTGCACRGAT CCTSTGACTG CTGOCaTTGA AGTTTTTTCC AATCftTGGTC 360 

ATTGGGATCA TTGCATTGAT ATTAGCACTC GCCaTTOGTC TGGGCATCCA CTTOGACTGC 420 

TCfiOGGARGr ACA6ATGTOG CTCnTCCTTT ARGTGTATOG AGCMATAGC TOGATGTGAC 480 

eoaaTCTOGG attgcaaaqa oggggaqgag GnOTAccxacT gtgtccgggt GGGraarcAG 540 

AATGGCGIQC TCCftGGTGTT CACaaCTGCT TOGTGGRAGA CCATGTQCXC C3SATGAC1GG 6O0 

AflOQGTCACr AOGOUiATQT TGCCTGTGCC C3UU3JOGGTT TCOCattOCTA TGTQAGTTCA 660 

GATAACCICA OAOTGAGCTC GCXGOAOGGG CM5TTC0GG6 AGGAOTTTOT GTCCATCXSAtT 720 

CftCSCTCTTGC CAGATOACaA GGTGACTGCA TTACACCACT CftGTATATGr GAfiGQABGGA 780 

TGTGGCZCIG GOCACGTGGT TACCTTGCAG TGCACAfiCCT GTGQTCRTAG AAGGGGCTAC 840 
AGCTCAOGCA frOSTGGGTGS AAACATGTCC TTGCTCTOSC AGTGGCCCTO GCftGGCCAGC 



Seq ID NO: C17 DNA Sequence
Nucleic Acid Accession #: NM_003220
Coding sequence: 63..1376



900 



— — — — — p« '^^'^^ V* ^-w-w 'tf^W'Aw^s^^^e.u' w^^'ujnjrvJ^nSw^ 7W 

CirCaOTTCC AGGGCTACCA CCTOTGOSGG GGCTCIQTCA TCAOGCCCCT GTGGATCATC 5GQ 



1020 



ACTOCTGCAC ACTQTGTTTA TGRCaTCGTAC CTCCOCRAGT CRIGGACCAT OCaGGTGGGT xu*u 

CTAGTTTOCC TGTTGGACAA TCCftOOCSCCA TOOCACTTX3G TOOAGRAGAT TGTCTAOCAC 1080 

AQCaACiaCA AfiCOlAAGAG GCTGGGOVAT GACATCGOCC TTATGAAGCP GSCTGGGCXa^ 1140 

CTCRCerrCA ATOAAATGAT CCRGCCTGTG TGOCTGCOCA ACTCTCMGA GAACTTCOOC 1200

GATGGARRAG TGTSCTQOAC GTCAGGATGG GGQQCC3VCAG AGGATGGAGS IGACGCCTCC 1260 

CCTGTCCtt3A ACKftCGCQGC OSTCeCTTTG ATTTCCAACA AGATCTOCRA CCftCftGOGAC 1320 

GIGTAOGGTG GCATCATCTC CCCCTCCATG CTCTGCGOGG BCTACCTGAC GGGTGaQ(7rG 1380

GftCAGCIGOC AGGGQGACAG OGGOCSGQCCC CTGGTGTGTC AAGAfiAGGAG GCT6TGGAAG 1440

TruBTaaoRa cgaccrgctt tgqcatcggc tgcxscagrgg tqaacaagcc togggigtac 1500

ACCCGTGTCA CCTCCTTCCT GGACTGGATC C3U3GAGCAGA TGGAGAOftGA CCTOAAAACC 1560

TGARGAOGAA GGGGJiCAAOT AGCCAOCTGA GTTCXOTftGG TOATGftAGAC AtSOCCXaTCC 1620

TCCOCTQSAC TCCOSTOTAG GAACCTGCAC AOTRGCAGAC AOGCTTQQAO CTCTGAGTTC 1680

OGGCACC3U3T AGCTGGCOOG AAAGAGGCAC OCTTCCATCT GATTCCAGCA CAACCTTCAA 1740

GCTQCTTTTT GTTTTTTGTT TTTTTGftfiGT GGAOTCTCGC TCTGTTGC5CC AQGCTGGAST 1800

GCftGTGGGGA AATCCCTGCT ChCTG<MCC TCOGCTTCtX! IGGTTCAAGC GATTCTCTTG 1860

CCTCRQCTTC CCCAGXAGCT GGGACCACRG GXGCCCGCCA. CCACACC<»A CCA&TTTTTG 1920 

TATOTTTACT AfiftaikCfiGGG TTTCACCaTB TTGGC3CAGGC TCCTCTCAAA CX3CCTGACCT 1980

CRAATCATOT GCCTGCTK2A GCCXGCCACA GTGCTGGOAT TACRGGCATG GGCGACCACO 2040

CCTAGGCTCA OGCTCCTTTC TGRTCTTCaC TRAGRACAAA AGAAGCROCA ACTTGCAAGG 2100

GOQGCCmC CCACTGGrCC ATCMaTTTT CTCTCCaGGa GTCTTGCRAA ATTGCTGRCQ 2160 

AQATAAGCftG TT&TQTGACC TCACGTGCAA AGOCRCCAAC AGCCACSTCRS AAAAGRiOGCA 2220 

CCRGCCCftGA RGTOCftGAAC TGCAOTCRCT GCAOGTTTTC ATCTCTRGGG AOCAGAACdA 2280 

JUiCCCaiCOCr TTCTACnTTCC AAfiACTTATT TTCACATSTG GGGftSGTTAA TCTAGGAATG 2340

GGOCIRTTTT CATCATTTCT TTGTAOCATT TGGTGCTTGA OGTATTATTG 2400 

TCSCTTTGRTT OCRAWERATA TCTTTCCTTC CCICRRARAA ARAAAAAAAA AAAAAAAAAA 2460 
AAAAA 2465



1 11 21 31 41 51

| | | | | |

GAATOTOCGSC TCTCTOGGTS RGRGACOGAG RGGGGCATAT CCGTTCRCGC CGAIECCATGA 60
AAATGCTTTG GAAATIOAOG GATAAlATCA AGTACHRGGA CTGCGAGGAC CaarCACBACG 120 
GCACCAQCAA CGGGACGGCA C3QGTTGCCOC AGCTGGGCAC TGTAG6TCAA TCTCCCTACA 180 
CXSAGOGOCCC GCXX3CTGTCC CACACCCCCA ATGCOGACTT CCASCCCGCSk TRCTTCCOCC 240 
CAOCCTAOCA GOCTRTCTAC OCC3CAGTCGC AftGRTCCTTA CTCOCAOGTC ARCGAOCOCT 
ACAGCCTGAA COCCCTQC;^ GCCCA6CCGC AQCCQCAGCA CCXIAGGCT06 CCOGGCCAGA 360

GGCAGAGCCA GQAGTCrGGG CTCCTOCACA CGCACCGQGG GCTGCCTCAC CRGCTOTOOQ 420 

GCCTGGATOC TCGCAC303AC TACAGGCOOC AGCaAGGACCT CCTGCRCGGC CC2^CX30C3C 480 

TCAGCTCAiSG ACTCGGAGAC CTCTCGATOC ACTCCTTACC TCACGCCATC OAGGAGGTCC 540 

CX3CATGTAGA AGACXTCGQGT ATTAACATCC CAQATCAAAC TGTAATTAAG AAAOGCCCOG 600

TGTCCXniGTC CAAQTCCAAC AQCAAT6C06 TCTCCQCCAT COCTATTAAC AAGGACAACC 660

TCTTCSGCGG OSTOGIQAAC CXTCAAGBAAG TCTrCTGTTC AGTTCCOG6T CEGCTCTCtJC 720 

TCCTCftGCTC CACCTOGAAG TACAAGGTCA COGTGGGGGA AGT<3CftG0GG CGGCTCTC3M: 780 

CACCCGAGTG TCTCAA03CG TCGCTGCTGG GOQGAGTGCr C0GGAGGGC3 AAGTCTAAAA 840 

ATGGAGGAAG ATCTTTAAGA aAAAAA.CFGG ACAAAATAGG ATTAAATCIG CClGCtiGQaR. 900

GAO&TAAAGC TQCCAAOGTT ftCGCTGCTCA CATCACIAGT AGAGQGAGAA GCTGTCCAOC 960 

TAGGCAGGGA CTITQOGTAC OTGTGCGAAA CCGAATTTCC TGCCAAAGCA GIAGCTOAAT 1020 

TTCTCAACCQ ACAACATTOC GATCCCAATG AGCAAGTGAC AAGAARAAAC ATGCTCCTGG 1080 

CTACAAAACA GATATGCAAA QAGTTCACCG ACCZTGCTGGC TCAGGRCCQA TCTCCCCTtJQ 1140 

GGAACTCACX3 GCCCAACCXX: ATCCTGGAGC CCGGCATCCA GAGCTGCTTG AOCCACTTCA 1200 

AOCTCATCTC CCAOGGCITC GGCA(dQCCCG CQGTGTGTGC OGCSGTCACG GCXSCTQCaOA 1260 

ACTATCTCAC CGAGGCXSCTC AAOGCCATG6 ACAAAATGTA CCTCAGCAAC AAOCCCAACA 1320 

GCCACAC3GGA CAACAACGCC AAA2W3CA6TG ACAAAGAGGA QAAfSCRCAGA AAGTOAGaCT 1380

CTCCTCCCX3C OCOGCCCCTC CCACGCCTCA CCAGCCCCCC GCXSOSCCCAC CCTCOGGCGG 1440 

GTGACAGCTC 06GGATCAGC AACCCTTCCT GCTaCTOCTA CTGCTGCTGC TGCTQCX^GOC 1500

GCQQCOBCCSS COGCTGCCCT 7GG6TOCCOC CGAGTCICOG GGACTGOCCI CIGGACT6IC 1560

AGT06G6CAG CCTCXCGGAC TCTQCACCCG CXTCCGAOCTC CCCACCCOCI CCCACACCGC 1620 

TGTGCOCCOG GAATTC 1636 

Seq ID NO: C18 DNA Sequence

Nucleic Acid Accession #: NM_002988
Coding sequence: 71..340

1 11 21 31 41 51

| | | | | |

GCGGCACSaS AiSGAGTTGTG ACXETTCCAAG OQCCAGCICA ClClGA£!CAC TTCraxSCCT 60 

GCCCAGCATC ATOAAOOOCC TTGCAGCTGC CCrCCTTGTC CTGGTCTGCA OCATGGCCCT 120 

CTGCTCCTGT GOVCAAGrxG GTACCAACAA AGAGCTCTQC TGCCTCXTTCT ATACCTCCTG 180 

QCAGATTCCA CAAAAGTTCA TAGTTGACPA TTCTGAAAOC AGCCXXXAGT GCOCCAAGCC 240 

AdQGTGTCATC CTCCTAACCA AGAGAGGCGG GCSU3ATCTGT GCZGAiCCCCA ATAAOAAGTG 300 

GGTCC3U3AAA TACATCAGCSQ ACCTGAAGCT GAATGOCTGA GGGGOCTGGK AGCIGOGAQG 360 

GOCCAGTGAA CTT06TGGGC CCAGGAGGGA ACAGGAGCCT GAGCCRGGGC AATGGCCXSXJ 420

CCACCCTGGA QGC3CACCTCT TCTAAGAGTC OCATCTGCTA TGOCCAGOCA CATTAACTAA 480 

CTTTAATCTT AGTXTATGCA XC3«!A3TTCA ITTTGAAATT GATTTCTATT aTTGRaCTGC 540 

ATTATGAAAT TAGTATTTTC TCTGACATCT CATGACATTG TCTTTAO^T OCITTCCCCT 600 

TTCCCTTCAA CICTTaGl:AC ATTCAA.1GCA IGGATCSU^TC AGXGTQAaRA GCITTCICAO 660 

Cft GftCA TTGT GCXATATOTA TCAAATGAOV AATCTTTATT GAATGGTTTT GCTCAGCACC 720 

ACCTTTTAAT AXAHTGGCAG TACimXTAT ATAAAABGTA AACCAGCATT CTCACTGTGA 780

AAAAAAAAAA AAAAAAAAAA AAA 803 






Seq ID NO: C19 DNA Sequence
Nucleic Acid Accession #: NM_004063
Coding sequence: 121..2619

1 11 21 31 41 51 

I I I I I I 

^GGGMSCGIT CCCGGGGOAG ATACTOCAGT OGTAGCAAGA GTCTOGACCA CTGAATGGAA 60 

GAAAAGGACr TTTAAOCS^CC ATTTTGTGAC TTACASAAAG GAATTTaAAT AAAGAAAACT 120

AT6ATACTTC AQQOCCATCT TCACTGOCTa TGTCDTCTTA TGCTTTATTT GSCAAGTGGA 180 

TATGGCCAAG AGGGGAAeTT TAGTOGACCC CTGAAAOCCA TGACAfETITC TATTTATGAA 240 

OGCCAAGAAC GGASTCAAAX TATATTCOlS TTTAAGQCCA ATOC7CCXGC TGTGACmT 300 

QAAC3mACTG GGGAGACAGA CAACATATTT GIGATASAAC GGQnGGOACr TCTGTATZAC 360 

AACAGAGCCT TGGA£!AGGGA AAOUkQATCT ACTCACAATC TOCAGGTTOC AGOCCTGGAC 420 

GCTAATGGRA TTATAGTGGA GGGTCCACTC CCTATCACCA TAGAAGTOAA GGACATCRAC 480

GAOWTGGAC CCAOSTITCT CC1U3TCAAAG TAC3GAAGGCT C3U3TAA1SGCA GAACTCIOGC 540 

OCAGGAAAGC OCTTCTTBTA TOTCAATGCC ACAGAOCTGG ATGAT0C9GGC CACTCCCAAX 600 

GGCC3U3CTTr ATTA0CA6AT TGTCATOCaUS CTTOCCATGA fTCftACAATlSl CAlXaXACTTT 660 

CAGATCZU^CA AQUUU^GGGG AQCCMCTCT CTTAOCOQAQ AGGGATCTCA GGAATTGAAT 720 

CCTQCTAAGA ATCCTTCCTA TAATCTGGTG ATCTCAGTGA AGGACSVTGGG AGGCCAGfliSr 780

GAGAATTCCT TQVSXGATAC CAjC!ATCTGTa QATATCATAO TGACAGAOAA TATTT6GAAA 840 

GCACCAAAAC CTGTGGAGAT GGTOGAAAAC TCAACTGATC CXChCCCCtCi CAAAAaXACF 900 

CAG6TGC6ST GGAATGATCC OSGrGCACAA TATTCCITAG TTGACAAAGA GAAGCT6CXA 960 

AGATTCCXaT TTTCAATTGA OCAGGAAGGA GATATTTACG TGACICAGCC CTTGGRCCGA 1020

GAAGAAAAGG ATQCATATGT TTTTTATGCA GTOKMAOQ ATGAtrabCGG AAAAOCACTT 1080 

TCATATCOGC TGGAAATTCA TGTAAAAGTT AAAGATAITA ATQATAAITOC ACXTACATGT 1140 

CCGTOJkCCJUS TAACCXJTATT TOAGGTCCAG GAGAATGAAC GACIGGGXftA CRGTAXG8G6 1200 

ACCX7TTACTG CACATGACAG GGAXCUUUaAA AAIACTGCCA ACAGTTTTCT AAACTACAGG 1260 

ATTGTGQRGC AAACTCCGAA ACTTOCJCATG GATGGACTCT TCCTAATCCA AACCTATGCT 1320 

GGAATGTTAC AGTTAGCTAA ACAGTCCTTG AAGAAGCAAG ATACTCCTCA QTAGAACTTA 1380 

AOGATAQAGG TGTCTBACAA AGATTTCAAG AOC3CTTTGTT TTGTGCAAAT CAAOSTTATX 1440 

GATATCAATG A1CAGATCCC CATCTTTGAA AAATQ\GATT ATGQAAAGCT GACTCTT8CT 1500 

GAAGACACAA ACSVrTGGGTC CACCA3CTTA AOCATGCAGG OCAGTGATGC TGKTGAGGCA 1560 

TTTACTOQGA GTTCTAAAAT TCTGTATCavr AXCATAAAGG GAGACAQTQA BGGACGCCTQ 1620 

GOGGTXGACA CAGATCCOCA TAOCR7iCaw:C OaATATOTCA TAATTAAAAA GOCTCTTGAI 1680 

TTTGftAACAG CAQCTQTTTC CAACATlGTG TTCAAAGCAG AAAATCCTGA GOCXCTAISTO 1740 

TTTGGTGTGA AGTACAATQC AAGTTCTTTT GCTAAGTTCA CGCTTATTGT GACAGATOTG 1800

AATGAAGCAC CTCAATTITC CCAACACGTA TTCCAAGOGA AAGrCAGTGA GGATGTAGCr 1860

ATAGGCACTA AAGTGGGCAA TGTGACTOOC AIUIGA'TCCAG AAOGTCSGOA CATAASCEAT 1920 

TCACTQAGGG GAGACACAAO AGGTTGGCTT AAAATTOACC AGGTGACIGG TGAGATCTTT 1980 




AGTGTOGCTC CATTGGACAG «5A»QC3CaQA MTCCATATC GCSGTACAftfiT G6TGGCCACR 2040 

C5AA6TAGGGG GGTCTTCGTT GAGCICTGTG TCRGRGTTCC ACCTCATOT TATGCSATGTG 2100

AAT6ACAACC CTCCCAGGCT JVGCCRAOQAC TACACGGGCT TGTTCTTCTG CCATCCCCTC 2160 

AGTOCACCTO GAA6TCTCAT TTTCGAGGCT ACTGATGATG ATCAGCACTT ATTTCGOGGT 2220 

CCCCATTTTA CATTTTCccr COaCACSTGGA AGCTTACAAA ACGACTG6GA AGTTTCCAAA 2280

ATCAATG0TA CTCATQCCCG ACTGXCXACC AGGCRChChG AGTTTGAGOA GAGGGRSTAT 2340 

GTOGTCTTGA TCXX3CATCRA TOATOOGGGT CQOOCACCCT TGGRAGGCAT TGTrTCTTTA 2400

CCAGTTACAT TCTGCAGTTG TGTGOAAGQA AGTTOTTTCC CGOCAGCAGG TCACCAaACT 2460 

GGGATACCCA CTGTGGGCAT GGCAGTTG6T ATACTGCTGA CCACCCTTCT GGTGATTOGT 2520 

ATAATTTTAG CAOTTGTGTT TATCEaCATA AAOAAGGATA AAGGCAAAGA TAATOTTCAA 2580

AGTaCTCAAG CA1CTGAAC3T CAAACCTCTG AGAAGCIGAA TTTGAAAAGO AATGTTTGAA 2640 

TTTATATAGC AAGTGCTATT TCflGCAACaA CCRTCTCATC CTATTACTTT TCATCTAAOG 2700 

TGCATTATAA TTTTTT AAAC AGATATTCCC TCrTffrC3CTT TAATATTTGC TAAATATTTC 2760 

TTTTTTGiUSG TGGAGTCTTG CTCrGTOQCC CW3GCTGGAG TACA6TGGT6 TOATOCCAGC 2820 

TCACTGCAAC CTCCOCCTCX: TGGGTTCACA a^OrTCTOCT GCCTCAaCTT CCTAAGTAfiC 2880

TGGGTTTACA GGCACCCAOC ACC&TCECCa GCTAATTTTT GTATTTTTAA TAGASAOSGG 2940 

OTTTCGCCAT TTGGCCAGQC TOOTCTTGAA CTOC-raACQT CAAOTQATCT GCCTGOCTTG 3000

GTCrCCCRar ACAOTCATGA ACCACTGCAC CCACCTACXT AGATATTTCA TQMCTATAG 3060 

^^"^^^ gftgft GArtXTTCAT TTTTCCATCA CAITTTTCCT CTCTGCAAAT GGCTTAGCTA 3120 

CTTGXQmT TCCXTTTTTOG GGCAASACaG ACTCATTAAA TATTCTGTAC ATTTTTTCTT 3180 

TATCAAGGAG ATATATCAGT GTTSTCTCAT AGAACTGCCT GQATTCCATT TATGrTTTTT 3240 

CTGATTCCAT CCTGTGTCCC CTTCATC3CTT GRCTCCTTTG GTATTTCACr BAATTTCAAA 3300 

aVTTTGTCAG AGAAGAAAAA CX5TGAGGACT C3W3G2AAAAT AAATAAATAA AAGAACAGCC 3360 

TTTTCOCTTA QTATTAACfiG AAATGTTTCT GTOTCATTAA OCATCTTTAA TCAATOTQAC 3420 

ATOTTGCTCT TTGGCTGAAA TTCTTCAACT TGGAAAIGAC ACaOACCCAC AGAAGGTCTT 3480

CftAA^^AC CTACTCTGCA AACCIXGGTA AJkOGAACCAG TCAGCTOGCC AGATTTCCTC 3540

ACTACCTGCC ATGCATftCAT QCTQCaCATO TTTTCTTCAT TCGTATQTTA OTAAAGTTTT 3600 

GGTTATTATA TATTTAACAT OTGGAftiGaAA ACftAfiACara AaAABftGTGG TGACRAATCA 3660 

AGAATftAACA CTGGTTSTAG TCMSTTTTOr TTGTTAA 3697






Seq ID NO: C20 DNA Sequence
Nucleic Acid Accession #: NM_004443
Coding sequence: 28..3024



1 11 21 31 41 51

| | | | | |

OBCrCGGCTC CTftGAGCTQC CAGGGCGATG GCCAGftGCOC QCCOGCCGCC GCS3GC06T06 60 

<X:GCOGOCQO GGCTTCTGCC GCTGCTCCCT CCGCTGCTGC TGCTGCCGCT GCTGCTCCXG 120 

CCC3acOGGCr GCCGGGCQCT GGAAGAGAOC CtCAXGQACA CAAAATGGGT AACATCXGAG 180

TTSSOffTGQA CATCTCATOC AGAAAQlGGa TGQCAAGAGG TGftQlGGCTA OSATQASGOC 240 

A1<3AAT00CA TCCGC3iCATA CCAGGTCTGr AAlQTGCOCa AGTC3UU3CCA GAACftAClGG 300 

CTTCGCaCX3G GGTTCATCTG GCeeOQGOAT OTGCAGOSGG TCTACGa^QA GCTCAAGTTC 360 

ACOXSTGOGTG ACTGCAACAG CATCCOCAAC AICCCCXSQCT CCTGCAAGGA GACCTTCAAC 420 

CXCTTCTACT AOGAGGCXGA CaGCJOATGUG GOCTCftGOCT CCTCCCCXrrX CTGGATQGAG 480 

^CggCTACG TSAAAOTOGA CMCRTTGCai CCXaSATOAGA GCTTCTCSCG GCTOQATQCC 540 

GGCaSTQTCA ACAOCAAGGT QOaCAGCTTT QGGCCACTTT CCaAflOCTOG CTTCTACCTG 600 

GCXTTOCftGG ACCAGQCacCC CSSCKTGrCG CTCRTCTCCG TC0GC3GCCTT CTACAAGAM 660 

TGTGCftlCxa^ CCRCOGCAGG CTTCQCACTC TTCCCOGAGA CCCTCACIGQ GGCTGAGCCC 720

ACJCrCGCTGG TCAtTTGCTCC TGGCACCTGC ATCCCTAAC3G COSTGGAGGT GTGGG1X5CCA 780 

CTCAASCTCT ACTGCftAOGG CGAXaOGQIU} TGGATGGTGC ClGieSSlOC CTGCAOCTGT 840 

GCCACOaGCC ATGftfiCCAGC TGOCARSGAG TCGCAGPIGCK OCCCCTCTCC OOCTQGGAGC 900 

TACAAGGOSA AGCAGGGRGA GGGGCCCTQC CTCCCATGTC CCCCC3«.CftG COffilACCACC 960

Tcamecjco ccrgcatctg cacctgccac aataacttct accgtgcaga <toggactct 1020

OCQC^CRGTG CCTQTACCAC CXSTGCCaTCT CxaDCXSOUftG GTGTGATCTG CAATGTOAAT 1080 

GAAACCTCAC TGATCCTOSA GTGQaOTSAG GCCCXSGGACG TGGaTGOOCXa GSATGACCTC 1140 

CTOTACaUkTG TCATCTGCAA GAnfiTQCCRT GSGGCTOGAS GGGCCTCAGC CTSCTCAOGC 1200 

TGrOftTOACA AOGTGGftCSrr TGTGOCTCGG CAGCTGGSCC TGACOGAGOa CCGG6TCCAC 1260 

ATCaVGOCKPC TGCTGGCCCA CROGOGCTAC AOCTTTOAGG TGCAGGCGGT CAAC3GGTGTC 1320 

TOGGeCaAOA GCCCTCTGCC GCCTCOTTAT G0GGC06TGA ATATCACXMC AAACCftOGCT 1380

GCCCCGTCTG AMTGCCCRC ACTACGOCTG C3tfaGC3M3Cr CftGTOAGCAG CCTCAOCXTTA 1440 

TOCIGGGCAC GC3CCAGAG06 6CCCAA£3QaA QTCATCCTG6 ACTAOBAGAr OAAaTACTTT 1500 

GA GAftGagQ G AGGGCATCQC C3CCACAGTG ACCftGCCRGA lOAACTCOGr GCRGCTGGAC 1560 

QSGGPTDGOC CTGAOOOOOG CTATGTGGIC CAiSOTCCGTG CCCX5CACAGT AGCTOGCTAT 1620 

GGGCa^QTACA GOCGCCXTOC CQAOTTTGAS ACCACRASTG AiSnfiAQSCTC TGGGGOOCAG 1680 

CAGCTCCAGG AGCflGCTTCC OTCATGGTG QGCTCOBCIA CftGCTOGGCT TCTCTTOflTO 1740 

GIGGCIGTCQ TGGTCATCSC TATOSTCTGC CTCftSGMflC AfiOSACRCGS CTCTGATTOCS 1800

OASXACACGG AGRAGCTOCA GCRGrAGATT GCTCGTOGAA TOAAGGTrTA TATTTSACCCT 1860

TTXAGCXAC38 AGGftOCCTAA TGAQQCTaxr OGGGAGTTTG CGAAGQAGAT CX3AOGIGTOC 1920 

TGCSTCARGA TOSABOAGGT GATCSGGflGCT OQaOAATTTG GGGflAGlGTa CCQTGGTOGA 1980

CTOAAACAGC CTGOOCGCOG AGftJGGTSTTT GTGGCCAMl AGAOBCIGAA GSTOGGOiaC 2040 

AOraAGAGGC AGCGGGQGGA CTTCCTAAflC GAGGCCTOCA TCATGGGXCA QTTTOATCAC 2100 

OCCAATATAA TCXX3GCTCQA GGGCGTGGTC ACCRAAAfiTC aacX3WjrTAT GATCCTCACT 2160 

GAGTTCaXGG AAAACTGOGC OCTGSACrCC TTCCTCCGGC TCAAOGAMG QC3M3TTCACG 2220

GTCRTCCAQC OXSGTQGGCAT GTTGOGGGGC POTGClQClCa GCATOAAOTA OCTGTOCGftfi 2280

ATGAACTAT6 MCACCOOGA OCrGGCTGCT OgCftACATCC TTCTCAACM CAACCTGGTC 2340 

T?QCAAAGTcr causAcrrraa octctcocgc ttoctggbgq atgaccoctc cgatcceacc 2400

TACACCAGTT CCCTOGGOGG GAAGATCCOC ATCCOCrtSGA CTGCCCCAGA GGCCATAGCC 2460 

TAMaORAQT TCRCTTCTGC TASTQKK3TC TGGAGCTACG QAATTOTCAT GTGGGAGGTC 2520 

ATGAGCTATG G?M3ftG0GACX: CTACTGGGAC ATGAGCRACC AGGATGTCAT CAATGCCGTa 2580 

GAGCAGGATT ACTOGCTGOC ACCAOCCATQ GACTGTCOCA CAGCRClGCA CCAGCTCATG 2640 

CTGGACTGCT GGGTGCGGOA CCGQAACCTC AGGOCJCftAAT TCXOTCAGAT TGTCMaHCC 2700 

C TGGAC ftflGC TCATOCGCAA TGCTGOCaGC CTCAAGGICA TTGCCAGCGC TCAQTCTGGC 2760 

ATOTCRCAGC OCCTCXrCGGft CCJGChCGOTC OCftGATTAia^ CAAOCTTCAC OACftGTTCGT 2820 




GATTGGCTGG 
GC&TCTTTIG 
CTGGCCOaCC 
CftGAOSCTGC 
GATGCCSUiaC 

TCCCTCCCCA 
TGATGACCCC 
CCACAAOCTC 
TAAGCCXSGGQ 
AGCAGTCOCT 
CCATCCTGRA. 
CCCTGTCOCC 

GCnrOGSTGA. 



A-TGCCRTCAA 
ACCTGGTGQC 
ACCA6AAGAA 
CTGTQCAQGT 
AGCCGGCT66 
AAGCTGOAAG 
GGAAGTG06C 
TCCCCAAGCC 
ACACTTGTCT 
TTCCACAQG6 
CCCTCRGOAA 
□CCA.GCITGC 
ACCCCOOCCC 
TGCCCCCCAG 
TGTGCGCOCa 
QOGTGTAAAA 
A66CAATAAG 



GATGGQ6CGG 
CCAGATG3^G6 
GATCCIGRGC 
CTGACACaSG 
ACTTTCX3GAC 
TTTQG{3AAAQ 
CCCAAACCTC 
CCTCAGGGCC 
GTTCTTC3W5T 
CXX3U3CCCTa 
CXGOAGGAGG 
ACCTCCAGTT 
TTGGTGCTGr 
AGACTGACrC 
a3COOt3C3ST6 

ATOAA 



GCAGMGACC 
A6TATCCAGG 
CTCCCACGQG 
TCTTOGACTT 
GGCCAAGCTG 
^TCATATTQA 
CRGACCTTCC 
GCTGGAGGTC 
GCAjGGGGTCT 
GGACTCCAGG 
TGCACAQGGA 
CATAAAAGGG 
TCBGAOCCftG 
TGTCTGTGQ^ 
OTQCOCTACA 



GCTTGG7GAG 
TGCXroOTAT 
ACATGCGGCT 
GACCCTGAOG 
TTGGATGCCr 
GGACTTCTCC 
AGATG6ATTA 
TGCTCTCCAG 
CrGGCA£3GGT 
GGCCCCCCAG 
AATOGGGAAA 
TTTGTOCTGG 
CAQGCAGGOG 
A6ATGGGATG 
CDQCACTGGCC 
AT0GGGCC3U} 



TGCXSGGGTTT 
TGGGGTCACC 
GCAGATGAAC 
ACCGTGCAQG 
GGCCTTAGGC 
AGGCCIQTQT 
GOAGAGOGGG 
CAGGQQATCC 

gTAGGCQGAG 

tgtgacacca 

GGGCTGAGGG 
CAGGCTGAGO 
TGTGAGTGTG 
TGCACAGAQA 
CTGGSOC G AC 



28fi0 
2940 
3000 
30 60 
3120 

3iao 

3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
37B0 
3805 
Seq ID NO: C21 DNA Sequence
Nucleic Acid Accession #: NM_001804
Coding sequence: 82..879



1 

i 

AGGT6AGCX3G 
GCGOQGGACC 
TAOCCCX3GCC 
GCXKQGCSZCC 
□CCCCCOOSC 
GCCGCCTACS 

cccxxrrccRG 

CavGCCCCTOQ 
oaGTGGATGC 
AAQGTUZAAGT 
CATTACRGOC 
ACTGAAOOQC 
AAQAAGAAAC 
AOCCCAGCCQ 
TOCTCTCCAA 
OGGGGGACCT 
caSAOTCTCA 
OGCCTIGGCC 
TAOAOCTCTG 
AAGCCCX^AAA 
TACSTTCAGA 
ATGGAGCTOA 
ARCTCRCACC 
GCOCmCCCA 
CRCTOQGTTA 
GGAGGCCCTS 
C3«a3CTGCX3C 
OVCAGCTGGA 



11 
I 

TTGCTCGTCG 
C060GGCCAC 
CAGCCfiGGCC 
CX3GCX3CCCCC 
CCCOQAGGGC 
GCC0GC3GCCC 
AGXTTAGCCC 
GGGGGOGOOG 
GGCQCAGCC3T 
ACCX3CX5TGGT 
GTIACATCAC 
AGGTGAAGAT 
AGCAGCAOCA 
GOCOWCCCT 
TGCCIGTGAA 
0GGGACT06G 
GCCCTQACCT 
C3ftTrOTGTGC 
GGGGATAAGO 
TGGTTGGGGG 
GGTGCAQCTG 
AAAAG&TGSA 
TGCCTCTTCT 
GGCT6GATAC 
CAGAATCftCA 
GGAC9GAGAGA 
ATCASGGCCC 
CAOC3TGTTGT 
CTTTTCftAA 



21 
I 

T0GQG8GGGG 
CATSTATOTQ 
AGCCftGCCTC 
GCAGTACCCC 
CTGGGGGGCG 
OSGeGCCCCT 
OaTGOCGGGG 
CAC3«X!GTCC 
GGCGGCCGGA 
CTACACaiAC 
AATC0GGCG6 
CTGGTTCCAA 
ACW3CCCCCA 
GGGGGGOCrrO 
AGA0GA6TTT 
5TGCn3G(3AC» 
TCTGOCbACAT 

GAGTCCAGOG 

GAGGCXTTGXG 
AXQCTTGCaG 
GCAGOCTCAC 
ITAASCACAAA 
GCCC?rCATGA 
AOGGGGCAGA 
AGGGAGCOOC 
AlATAGAGTe 



31 

I 

CQaC»0GGGC 
GGCKAT6TGC 
GGCKTGGQCC 
QACTTCTCCA 
CCCITCCCTQ 
GCX33CCAGCC 

TOHC3COGGAG 
OGOGGCGGTQ 
CACCAACGCC 
AAATCUSAGC 
AAGCX3GGGGG 
CAGCCGCCXSfl. 
TOTCCCAGCA 
CroCCATAGC 
TOTGGCTCCT 



TTGGATAAAG 
TOGATGATCT 
ACAASQCrCC 
TGGGGACCAC 
AGCKTGAOCr 
CTCTACCTGC 
GCCCATAGCA 
TCATTCTCA0 
GTCTTCCCTA 
GAGAGGACrr 
GAATCTCiTCj 



41 
1 

GGCTCCAGGG 
IGGACAAgOA 
CXWCAAACTA 
GCTACTCTCA 
OC3C3CX3U)GGA 

CAGcrrcxscT 

OC3GGOCCGGG 
CGCAGAGGCC 
GC3WX3C3GTAA 
TGGAGCIGGA 
TGGCrGCSCAA 
CAAAGGAGtSS 
TQQCCCAOGA 
ACACCAQCCT 
CCX3VT60CCA 
GTGGQCCCSIS 
CACCTATOCA 
AOCITCCAGC 
CAATCTCC53G 
AfiOCCCCACC 
ACTGATCCIG 



Seq ID NO: C22 DNA Sequence
Nucleic Acid Accession #: NM_021978
Coding sequence: 36..2603



1 
) 

GACGOCXGM 
GOGQAGGGGG 

AGCATGOOOC 
TGCTGQGGAT 
TCnCCAATGO 
ACTGCSICTGA 



GCAGOBTCAT 

OCOAGOGCJST 
AGTCCTTTOT 
OCCAGGACAA 



CGGGGGAOGC 
ACBAGOGOGG 
CCCTGCSTGCA 
AGAAOSXOCT 
CCACCTTCn 
□GACATTCAA 
ACATTGAGGT 
AGCOOSGOGT 
ACTGCGGAGA 
TCCACXCAGA 



11 
I 

AGAtXX:GC3GA 
CCOSAAOGAC 
GGAGQAAGGC 
CSGGGOGCTGG 
CGGCTTCCTG 
CTACATGA18& 
GTTTGTAAGC 
ATTCCXGGGC 
OQCCTACTAC 
CATGOCOSAO 
GGTCACCrCA 
CAGCTGCAOC 
CITCCCTGAC 
OGACTCAQTG 
CAQCC3ACCTG 
QTTGTGTGQC 
GCTCATCACA 
CfSVGCIGCCr 
CA60CCCTAC 
GCCCSiAJCAAC 
GCCIGCOGGC 
OAGSTCCCAG 
TCAGTCCTAC 



21 
1 

GCGGCCTOGG 
TTOGGOGCQG 
GTOGAGTTCC 
GiGGiacTog 
GTGTGGCATT 
ATCACAAATG 
CTGGCCRGCA 
CCCTACCACA 
XGGTCTGAGT 
GAGOGOGTAG 
GTGGTGGCTT 
TTTGGCCTGC 
AGCCCCTACC 
CTGAGOCTCA 
CSTGaCGGTOr 
ACCTACCCTC 
CFGATAACCA 
AGGATGAGCA 
TACCCAGGCC 
CRQCaTGTGA 
ACCXGCXXJCA 
rETGOTOGXCA 
ACOQACACCG 



31 
I 



GftiCTCAAGZA 
TGCCAGTCAA 
CAGCCGTGCT 
TQCAGTACCG 

AGQTGAAiGGA 
AG6AGTCX3GC 
TCAGCATCCC 
TQOGCIGCC 
T0CX3CAGGGA 
ACGCOCGCQG 

coacrcATGC 
ccrrocGCiva 

ACAACAOCCT 
CCTGCTAl^lA 
ACACTQAGCG 
GCTGTGGAGQ 
ACEACCCACC 
ASQTGOC5CT1' 
AGSACTACGT 
CCAGCAACA6 
GCITCITAGC 



CCOCATCATA 
CTGGGCTCTQ 
TGAGGGCTCT 
OCAGGTTTCZ 
TATTDG6ACC 
GA^nSCAGCTT 



41 
I 

GAGGGATCGG 
CAACTCCOGG 
CAA0GTCAA£3 
GATOQQCCTC 
GGAOSTSGGX 
GGATOOCTAC 
OGOGCTGAAG 
TSTQAOSGOC 
GCAGCACCTG 
OCOOCSGGGOG 
CrCCAAAACA 
TGTGGAGCTG 
CXrCCTGCXAO 
CTTTGAOCTT 



51 
I 

CCOUaCATGC 

CGGCCCCCCQ 
CGTCGAGCCX3 
OGACXGGQCC 
GGCATTOGGG 
CCTOCIGGCG 
GACGCCCTAC 
GACrCGQACC 
0AAGGAGTTT 
TCIGGGOCIC 
CAAA6TGAAC 
CATCAOGOCC 
CCTGGOCaCC 
GCClGTGCaC 
GAGGTCIGGr 
CCCTCTGCAT 
TCCTGTGTTC 
TGGGCATCTC 
TGCTGCXCCA 
GaU3AAAA06G 
GAACSTGGTC 
AGGQCACXGA 
ATGGCTGCTC 
GGATTGAQAlS 
ACAGCXXX3SC 
AAGCAQAGCT 
CAAGAATAAA 



51 
1 

GCCOGCAASG 
CAGGASAAAG 
AAOOTGGAAA 
CICTTGGTCT 



CCTGACCTTC 
OOGGCATCCC 
CCQCTTAOCT 
CAACATTGAC 
CAAATTCTTC 
OGAGATCAAT 
CAAC3UU3ATC 
ICGAAXACCTC 



GAGAACCCCA 
CL'UCTKjTVACA 
TTCAGGGAGG 
OT GOASGAG S 
GGCICCCTQA 
GrAChG3>UGGA 
ATQCX3CTTCA 

GGGTGCIGCG 
GAGCC30CAOGr 
CACTCCTCOC 
GGCTTTGAG6 
AAAfiCOCAGO 
TGCSICATGGA 
TACCTGCTGG 
aOOGAGIUUiT 
ACAGlTGOCr 
TCCTAOGACr 



60 

120 

180 

240 

300 

360 

420 

4B0 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

lOBO 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

iseo 

1620 
1680 
1699 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

90O 

960 

1020 

IQBO 

1L40 

1200 

1260 

1320 

1380 
CXZAGTQACCC ATGCCCX3GGG CAGTTCACGT GCCGCACX5C3G GOGGTGTATC CGOAAQQAGC 1440

TGCGCTGTCA TGGCTGGQCC GACTQCACCG ACCSlCAGCJeA TGAGCTCAAC TGCAGTTGCG 1500 

ACOCCGGCCA. CCAGTTCAOG IGCAAGAACA AGTTCTCSCAA GCCCCTCTTC TOGGTCTGGG 1560 

ACAGTGTGAA CGACTGOBQA OftCAACAGCQ ACXSAGCACSGS QTGCA6TTOT CCGGCCCAGA 1620 

CCTTCAGGTG TTCCAATGGG AAGTGOCTCT CGAAAAGCCA GCAGTGCAAT GGGAAfiGACQ 1680

ACTGTGC5GGA OGGGTCCaSAC QAOaCCTCCr GCCCCRAGC3T GAACOTaSTC ACTTGTAOCA 1740 

AACACACCTA CCGCTGCXTTC AATGGGCXCT GCTTGAGCAA GGGCAACOCT GAGTGTGACQ 1800 

GGAAGGAGGft CTGTAGOGAC GGCTC2U3hTa AGAAGGACTG G6ACTGTGGG CT6CGGXCM' 1860

TCAGQAOACA GGCTGGIGTT GTUGGOQGCA CGGAIGOSQA TQABGGCXSAa TGGCCCTGGC 1920

AGGTAAGCCT GCATQCTCTG GGCCAGGGCC ACATCTGCGG TGCITCGCTC ATCTCTCCCA 1980

ACTGGCTC3GT CTCTGCCGCA CACTGCTACA TCGATOACAG AGGATTCAGG TACTCAGACC 2040 

CCnCQQiasG GACOGCCTTC CTGGGCTTGC ACGAOCAGAG CCAGCGCAGC GCCCCTGGGG 2100 

TGCAOOAGGG CAGGCTCAAG CX5CATC3VTCT OCCROCCCTT CTTCAATQAC TTCACCTlfOG 2160 

ACTA3GACAT OGOGCTGCTG GAGCTGGAGA AAC06GCAGA GTACAGClCC ATOGTGCGaC 2220

CCRTCTGCCT GCCGGACGCC TCCCATGTCT TCCCTQCCXSQ CAAGGCCATC TGGGTCACGG 2280

GCTGGGGACA CACCCAGTAT GOAGGCACTG GOGCGCTGAT CCTTGCAAAAG GGTOAGATCC 2340 

GQQTCATCftA CCA6ACCACC TGOGAGAACC TCXZTQCCGCA GCA6ATCA0G CGGCGCATGA 2400

TGTGGGTGOG CTTCCTCftGC QQCOSGGTGG ACICCIGCCA GGGn^VTICC GOaGOhCCCC 2460 

TQTC3CA0OGT OCSWGOG^T GGOOQGAlXn' TGCAGGCaaa TQTGtTTQAGC TGGGGftGACG 2520

GClGCGCTCa GAGGAACAAG CCRGGC3GTGr ACACftAGGCT CECTCTGTTT CGGGACTGGA 2580

TC3UVAGAGAA CACTGGQGl:A TAGOaaccaa GGCX3«X!CAA ATGTGTACAC CTGCGOGGCC 2640 

ACCCATCGTC CACCCCAaTG TGCAOGCCXG CAGGCTGGAG ACXGGACOQC TGACTGCACC 2700 

AGGGCCX?CCA GAACATACAC TGTGAACTCA ATCTCCAGOS CTOCAAATCT GCCTASAAAA 2760

OCTCTOGCTT CXTrCAOCCTC CAAAQTGGAG CTGGGAGGIA GAA6GGGA6G ACACTOOrrao 2820 

TTCTACTQac CCAACTGGGG GCAAAGSTTTT GAAGACACAQ CCT0CXK:CGC CAGCCCCAAG 2880

CTGGGCCGAG GCGOQTTTOT GTATATCTGC CTCCOCTGTC TGTRAGGAGC AGCGGGAACQ 2940

GAGCTTCaaOA GCCTCCTCAG TGAAGGTGGT QGGGCTGCCG GATCTGOaCT GTOC3GGCCCT 3000 

TOGGCCACGC TCTTGflGGAA GCOCAGGCTC OGAOQACCCT GGIUWU%GA CGGGVCSGAS 3060

ACTGAJAATG GTTTACCAGC TCCCAGGTGA CTTCAGTGTG TGTAITGTGT AAATGAGTAA 3120 

AACATTTTAT TTCTTTTTAA AAAAAAAAA 3149
Seq ID NO: C23 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2268



1 11 21 31 41 51

| | | | | |

ATGQQCCCTT TCCIGTT8CT GGAASCCGTC TGTOTTTTCC TGTTTTOCA6 A8T60CCCCA 60 

TCTCTCCCTC TCSCAGOAACIP CCATOTRAGC AAAiGAAACCA TCJOOQAaiSAT TTCa«3CT<3CC 120 

AGCAAAATGA TGTGGTGCTC GGCTGCAOTQ GACATCATGT TTCTGTTfiiGA TGGGTCTAAC 180

AGOSTCGOSA AftSGQAGCTT TGAAAGGTCC AAGCACTTTG CCATCACaCST CTQTGACGGT 240 

CTQGACATCA GCCCCGAGAG GGTCAGAGTG GOAGCATTCC AGTTCAGTTC CACTCCICAT 300 

CTGGAATTGC OCTTOQATTC ATTTTCMCC CUVOySGAAS IGAAfiGCiAnS AATCMGAGO 360 

ATGGTTTTCA AAGGAGGGCG CACGGUU3ACB QAACTTGCTC TGAAATAiCCT TCTGCACAGft 420 

QC36TTGCCTQ QAGGCAGAAA TGCTTCTGTG CCCCSWSATCSC TCATCATCGT CACTGATOGG 480

AAGTCCCAGG GQCSA-TGrGGC ACTGOCATCC AAGCAGCTGA AGGAAAQGGG TGlCRCTOTB 540 

TTTGCTSTGa QG(3TCAGGTT TCOCAGGTGG GAGGftGClGC ATGCftCTSaC CftGOSAGCCT 600 

AOaGQGCAGC AGGTGCTOXT GGCIGAGCZAO GTGGAGGJVTG CGAOCAAGGG CC a 'C iT OWSC fifiO 

ACCCTCAGCA GCTCGQCCAT CTGCTCCAGC GCCAQGGC3US ACrTGCAfaGSr CQASGCTCAC 720 

Cx:xrrGTC3AGC ACAGGAOGCT GGAGATGGTC CGGGAGTTOG CTG6CAA1X3C OOCATGCTGS 780

AGflGGATGGC GGCGOACCCT TGOGGTGCTG GCTGCRCfW^ GTCCCTTCTA CAGCT6GAAQ 840 

AQAGTOTTOC TAAOOCAOCC TGCCRCCTGC TACAGGACCA 0CTCCX:CAGG OOOCIGTGAC 900 

TGGCAGCCCr GCCAfiAATGa AGGCACATGT GTICCAGAaS GACIGGhCQG CTAGCftGTGC 960 

CTCMtSXGC TGGCCTTTQG AGGSGftSGCT AACTGTGCQC TOAMSCTGAG CCT6GAATGC 1020 

AQGI^TOGAGC TCCTCTTCCr GCTQGACAGC TCTGOGGGCA OCACrCTGGA OGGCTTCsacO 1080

OGGGCCAAAG TCTTOGTGAA G0GGTTT6TG OSGGCGGTGC TQAOCaAGQA CTCTOSGGCC 1140 

OGAGTGGGTG TQGCCftCATA C3U3CAGGGftS CEGCTGGIGG CGGTOXSIQT GGQaQaOTAC 1200 

CASGATGTGC CTQACCSGGT dGGAGGCXC GATGGCAITC CCTTCCX3TGG TOGC00C3UX: 1260 

CTGACGGGCA CTGCCTTGCSfi GCMaGOGIKA. GAGOGTGGCT TOQGGM3C3GC C31CX3VBGACA 1320 

GGCCAGGACC GGCCACGTAG AGTGGTQGTT TTGCTCACCG AflTCRCACTC caaAGGATGAQ 1380

GTiaOOGGOC CAGCGOSTCh. OGCAASGGGQ OGAnAGCTGC TCCTGCTGGS TGTAGGCAGT 1440 

GAGG0CX5TGC GGOGAOAaCT GGAGGftSATC ACAQQCa^GCC CAAftSCATGT eftTBOTCTAC 1500 

TOGGATCCTC AGGATCEGTT CMJCCRMCSC CCTQASCTOC AGaQC3B\AGCT GIGC&GOOGG 1560 

CAGCGGCC7VG GGTOCCSGGAC ACAAGCCCTG GAGCIOGTCT TCAI6TTGGA CIACCTCTGCC 1620 

TCAGTAGGGC CCXJAGAATTT TGCTCAGATS CfiGAGCTTTQ TQAGAAGCTG TGCCCTCCAG 1680

TTTGAGGTGA ACCCIGACGT GA(3«ZAGGXC GGCCTGGTGG TGTATGGCAG CCa\GGTGCAQ 1740 

ACTGCCTTCG GGCTGOACAC CfJADOCRCC OGQQCTCSCJOA waCTGCGGGC CATTAGCCAG 1800 

GCCC3C5CTAOC TAGGI^GGGGlf GGGCTCAQGC GaCACCGCGC TGCTGCACAT CIATOACAAA 1860 

GTGATGZMXS TCC^GAQGGG TGOCOGGOCT GGTGTCOCCA AAGiCTGTOGT GGTGCTCACA 1920

GGCOGGftGRG GC3BCAGAGQS TGCAGCOGTT CCTGCOCAGA AGCTQAGQAA CAATGGCATC 1980

TCTGTCTTGG TOGTGGGCGT GGGGCCTGTC CIAAOTGAGG GTCTGOGGAG GCTTGCAGGT 2040 

CCCCGGGATT CCCTQATCCA CGTGGCAGCT TACGCCGACX: TGCGGTACCA CCAGGAOSTG 2100 

Cl'CATTQAaT GGCTGTGTGG AGAAGCCAAG CQQCCAGTCA AOCTCTGCAA ACCCftGOCCG 2160 

TGCATGAATG AGOGCAfiCTG OGTOCTGCRG AAIGGGnSCl ACOGC3X5CAA OTGTOGGGAT 2220 

GGCTGOGAGG GCCCCCACTG CGAQAACCOA TTCTTGAGAC GOGOCTGA 2268

Seq ID NO: C24 DNA Sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..2424

1 11 21 31 41 51

| | | | | |

ATsccciCcaTr xccrGXTGcr oGAGGCCserc tgtgttttcc igttttcicag aqtgcocxjca 60

TCTCTCXZCTC TCCAGGAAGT CCATOTAAGC AAAGAAACCA TCBGGAAGAT TTCAGCTGCC 120 
AGCAAAATGA TGTGCTGCtC GOCTGCAGTG GACATCATOT TTCTGTTAQA. TQ{3GTCrrAAC 180

AGOSTCOCSQA. AAGQOMCTT TCAAAGOTCC AAC3CACTTTG CCATCACAGT CTGTGAOSGT 240 

CTGGACATCA. GOCCOGAGAG GGTCAGAGTG GGAGCATTCC AGTTCAaTTC CACTCCTCAT 300 

CTGGAATTCC CCTTQGATTC ATTTTCAACC CAACAGGAAG TGAAGGCAAG AATCAAGRGG 360 

ATGGTTTTCA AAJQGAGGGCX3 CfVCGGAG&GG GAACITGCIC TGAAATACCT TCTGCACAGA 420

GGGTTGCCTG GAGGCAGAAA TGCTTCTCTG COCCAGATCC TCATCArCGT CACTGATGGG 480

AA6TCCCA6G GGGaTGTOGC ACTGCCATCC AAGCAGCIGA. AQQAAAGGGG TGTCACTGTG 540 

TTTOCTGTGG GGGTCAGGTT TCCCAGGTGG GAGGftGCTGC ATGCAjCTQGC CflGGGAGCCT 600 

AGAGGGCAQC AOGTGCfGXT GGCTGAQCfta GTQGAGGATG CCACCAAOOG CXITCTTCAGC 660 

ACCCTCftGCA GCTCEGCX:AT CTGCTCCAGC GCCACGCCAG ACTOCAGGGT CGAGGCTCAC 720

OCXn^GAGC ACftGGACGCT QOAGATGGTC CGGGAGTTCG CTGGCAATGC CCCATGCTGG 780 

AOAGGATCGC GaGGGACCX:T OXSGGGTGCrG GCTGCACACT GTCOCTTCTA CAGCXG6AAG 840 

AGASTQTTCC TAAOCCACCC TQCCAOCTGC TACAGGAOZA CCTGCCCAOa OCXXTTGIGAC 900 

TCeCAGCCCT GCCAQAATGG AGGCaCATGT GrrCCAGJlAQ QACTGQACGG CTACCASTGC 960 

CTCTGOCGQC TCGOCTTIGG AaOQGAGGCr ARCTGTGCCC TGAJ^GCIGAS CCTGGAATGC 1020

A6GGT0GACC TOCTCTTOCT GCTGGACAGC TCTGCQGGCA CCACTCTGGA OGGCTTCClG 1080

OGGGCCAAAG TCTTOGTSAA GCGOTTTGTG CGGGCCGTGC TGAGCGAfiGA CTCTCGGGCC 1140 

OGAGTGQGTG TGGCCACATA C3\GCAJSGGAG CTGCXGCSTSG CC3GTGOCT6T GG6GGAGTAC 1200 

CAGGATQ^rac CTGACCTGGT CTGOAGCCTC GATGGCATTC CCTTCCaSTGS TGQCCCCACC 1260 

CTGACOGGCA GTGCCTT60G GCAGGCX3GCA GAQCiraTGGCT TOGGGAGOGC CACCAGOACA 1320

GGCCAQQACC G6CCAOGTAG AQTGGTGGTT TTGCTCACTG AGTCACACTC CGAGGATGAG 1380 

GTTGGGGGCC CaGCGCGTCA a3CA7K3GGCX3 CGAGAOCTQC TCCTGCTGGG TGTAGGCMT 1440 

GAGGCCQTGC GGGCAGAGCI GGAQGAQATC ACAGGCRGCC CAAAGCATGT aATQGTCTftC 1500 

TaSGATCCTG AOGATCTOTT CAACCAAATC CCTGAGCTGC AGOGGAAGCT GTGCAGOGGQ 1560

CftflCSGGOCAG OQTGCCGGftC ACAAGCCCIQ GACCTCGTCT TCATGTTGGA CaCCTCTQCC 1620

TQPt^ShGGQC CCGAGAATTT TGCTCAGATG CftGAGClTTG TGA£3»flGCTG TGCCCTCCAG 1680

TTTGAG6TGA ACCCT?GAa3T GACACAGGTC GSCCTGQTGG TGTATQGCAG OC5\GQTQaM3 1740 

ACraCCTTOG GGCTGGACRC CAAACCCACX: OGGGCTQOGA TCCTGOQOaC CATTAGOCAG 1800

GCCCCCTACC rAQOTOGGeT GGGCTCAGCX: GGGACCDGCGC TGCTGCRCAT CIATGACAAA 1860 

GTGAaOACCG TOCAGZ^GGGG TGCCOGQCCT GGTGTCCGCA AAGCTGTGGT □GTGCTCACA 1920

GGCGGGAGZU3 GOGCAOAGGA TGCAGOCGTT OCTGCCCAGA AiQCTGjUSGAA CAATGGGAOrC 1980

irCTOTCTTGG TOGTQGGOGT GGGGCCTGTC: CTAAGTGAOG 6TCTGCX3GAG GCITGCAGGT 2040 

CXICOGGSATT CCCTOATCXa CGTGGCAGCT XACGCCGACC TOOGGTAOCA CCAGGACGTa 2100

CTCATTQAGT GGCTGTGTGG AGAAQCCAAG CAGCCAGTCA ACCTCTGCftA AGCSCAGCCXSG 2160

T6CATGAATG AGGGCAaCTQ OGTCCTGOVS AATGGGAGCT ACCGCT6CAA GTGTCiGGGAT 2220

GGCTGOOAGG GCOOCCACTG 06AGAACGGT 0AGTO6AGCT CTTGCXtCTGr AIY^TOTGAGC 2280 

CAOG6ATGGA. l^TCTTGAGAC GCCCCTGAGG CAC2KIGGCIC CCX3T6CAGGA OGGCAGCAGC 2340 

aSTACCCCTC OCAGCAACXA CnQA^AAGGC CTGGGCACIG AAATGGIGCC IACCTTCTG6 2400 

AAT6TCTST6 COCCAGGTGC TTAG 2424 

Seq ID NO: C25 DNA Sequence

Nucleic Acid Accession #: XM_0973e6.3

Coding sequence: 142..795

1 11 21 31 41 51

| | | | | |

CTOtSCAGAAC CAOCIGGACT CTGTCOGtOT CrGTCCCCCG GCCTOCAGGG CTCCTCTOCC 60 

66GACXXX:C3G TCOCACXSCCT GGGTOCCGOG CCGGGGGAAG C33CCTGCTGC CTATCTCTGT 120 

CTACCTCAGG TCTQACTTTT GAT6CCAAAA rrCXQAnCX:CC TGGGGTGCCT CXCXXSCOQaC 180 

TCOCQTGCAC CAGGGTCIGC ftGCAQCCACT GGGGCCTGGC TGCCTOCIGC ATCTQGOQGC 240

OCTOGACCCC TGOGQCOCOC GICCAOCTGC CCACCTCOGA GCCTGQGGAG GGGCXX3TQCA 300 

GGGTCGAGOG CTOGGTOGTC ICCCTGGQQC TGOGTGTGTG TGXC3QOQAAT C3CTG0GTGTC 360 

OTGTCTGTGG GOGATOCSGGC: CTCCOGGOGG TOGGrGQACC TGGATTCTAA CTCAGAGGAC 420 

TTGAGCXITGC TQTTAACTOC GATGATTOTA GGQACAGGCG GGGTGGGTQQ OGQarGGGOG 480

CQAGGCTGGG TOOCXSGCCSCA GOAGAAGGAA GTOGCTGAAG 0C3bGTOGCCA TGCTGGCCGT 540 

GGftRftTGGGA OQOBOTTGCA GAGGGtCEAT GGGQCC066T CCTGGATACT CGGCAGGAAf? 600

OCOrqrCTGC AGAfiGCTCCr OCXTOCCrCA GGTGGCCCKS TTCAACCCC& GCX3GTGCCCA 660 

TCTGCIGOCA OCX?CCrGT0G GTGGGGGTTT AAATTCGGTG TGGCXTTCXG GQQTQCaVQCT 720 

CAGCACOCCC OCTrAOGCaa ACTGGGAGGG GGTOGGGCS^ TtsaCcrCAGC CACGAGGACC 780

CTOGATQ3GT ICXAGTTCAC TTGGGACOST GGGOaCTGGC TGCGTACTGA GIGGOIOC3CC 840

GACSnSTCAAa GOCMU3G6G& GCXCGCCCaQ CTCTGAGATG TOlGGGAGAAA GGCGGCITCT 900 

GtaAACCTTCC GTGGGACOGQ TAAGTGGCTG TCCSUaUUkaa C3G6GAGG{?r6 GGCAQGGGGC 960 

AOGQGSGQCA GCTG GGGTCG TTGTTAAGGQ TCAOGCATCT GTACRGTTGA ATTTCCTTTC 1020 

TCTTATCATG TTTTACCCAC CTTGTOOCTr TTTTCODCAA TXQTGCTTTT GCATTTTTTT 1080 

CCTTGGCaAA TGTAAACTCA GOClTKaVTT CATOAOOTOT GftAMTTCfiG TTTCTCTGOA 1140

CTTTGTCW3A CGGGGTGGGIA ACCAOGGCTG AAACTCAGGr AATAOGAGOA AAAAAAAAAA 1200 

AATTTTTAAIl AAACAXAAAA. CTgtfJCTCT A 0CICIGGCT6 GGOCCSlGCCT 1260 

GTCXOSOCXTr GGOOGCaSCiA QQOTGGCCTG TaACRATTifC AQTTTTCaaCA GAACaaTTCaS 1320

GIAIXftAAAiG GAAMUVA 1337

Seq ID NO: C26 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 95..2128

1 11 21 31 41 51

| | | | | |

GQGGTAQTTT GTAOGGAOGC AGCICTGCAC GTGOSGGACX GCGAGGCTQG ACGCTAOQGG 60 

CTCCTGGAAA GSAGAQACAC CAGCATTTGC CRJCMOG^^T^ TCKTCCACTG ACTTTAC3«Tr 120 

TGCTTCCTOQ OAGCTTGTGG TCX5GOGTTGA CCATCCCAAT 6AAGAGGAGC AOAAAGAOGT 180

CSVCACTGAGA GTAICTGGAG ACCTTCATGT TGGAGGAGT6 A!rGCTC3UU3T TAGTAGAACA 240

OftTCAATATA TCOCAAQACT GSTCAGACTT TQCTCTTTGS TgOQAACAjGA AGCATMCTG 300 

GCTTCTGAAA ACOCACtGGA CCCTGOACAA AXAIGGGGTC CAGGCAGATQ CAAAGCITCT 360 

CTTCACCCCT CA0CATAAAA TaCTGOGCCT TOTTCXGCOG AATTTOAAGA TGGTOAGaTT 420 

GOGAGTCAGC TTCTCAGCTG TGCTTTTTAA AGCrGTOUnT GATATCIGCA AAATOCTGAA 480

TftTTAGAAflA TCAGAAGAGC TTTCCTXGTT AAAGCC6TCT GGTGACTATT TTAA<3AAGAA 540

GAAGAAAAAA GACftAAAATA ATAAGSAACX: CRTAATTOAA GATATTCTAA ACCTGOAGAG 600

TTCTCCAACA GCTTCAGGTT CATCA^STAAG TCCrGGTTTA TACflSTAAAA CCATGACOCC 660 

TATATATGAC CCCATCAATG GAACACCAfiC ATCATCX3iCC ATGACTTGGT TCftGTGACAfi 720 

CCCTTTGACQ CRACAAAACT QCAaCATCCT GSCATTCAGC CAACCCCCXX: AGTCCCCAGA 780

AGCftCTTGCG GATAXGIACC AGCCTOGGIC TCTGtJTTGAT AAAGCCAAGC TCAATGCAGG 840 

•rrGGCXAOAC TCCTC5U2GCT CCCTTATGGA ACAAGGCATC CAflGAGQATG AGCAGCTGCT 900 

CTTAOGATTT AAATATTATT CTTTCTTCGA CTTGAATCCT AAATATGATB CIGTCCGAAT 960

AAAOC/^ACTC TATGAGCAAG CCAGGTGGGC CATTCTCTTA QRAGAflATTG ATTGCACAGA 1020

GGAAGAAATQ TTCATCTTTG CflGCTCTACA GTACCACATT AGCaAACTGT CJGTTOTCTaC 1080

116AAACACAG GATTTTGCAG GC3GAGTC0GA GGTTGATGAA ATAGAAGOGG OGCTTTCTAA 1140 

TTTGGAAGTA ACOCTAGAAG GTGGAAAAGC GGACftGCCTT TTGQAGGACA TTACTOATAT 1200 

CCCTAAACTT OCAGATAATC TCAAATTATT TAGOCCCAAG TiAGTTACTAC CAAAA6CITT 1260

CAAACAATAT TGGTTTATCI TTAAAQACAC ATCCATAGCA TACTTTAAAA ATAAOSAACT 1320

TGAACAAGGA GAACCACTAG AAAAACTAAA TCTTAGAQGC TGOGAAGTTG TGCCOSATGT 1380

AAATGXAGCA G8AAGAAAAT irOOAATCAA GTTACTAATC CCrQTTC3CX33 ATGGTATQAA 1440 

TOAAATQTAT TTGAGATQTG ACCATGAGAA TCAATACGCX: CTVATGGAOrGG CXGCCTCCAT 1500 

GTTGGCATCG AAGGGCAAAA CCATOQCaOA CAGCTCCTAC CAGCCSU^ajGG TCCTCAACKT 1560

CCTTTCATTT CTGAGGATQA AAAACAGGAA CTCTQCftTCT CftGGTGGCTT CCAC3TCTCGA 1620 

AAACATGGAT ATGAACCCAG AATOTTTTGT GTCACCACGQ TGTGCAAAAA GACACAAATC 1680

CAAACAGCIG GCGGCCCOGA TOCTGGAGGC GCACCAGAAC GTGGCCJCAQA TGCCCCTGGT 1740

06AAGCCAAS CTQCSSGTTCA TCSCAaOCGTG GCAGTCACTG CCTOAGTTTG GCCTCAQCTA 1800

CTACCTTGTC AGATTTAAAO GAAGCAAAAA AGATQACATT CTGGGAGTTT CATATAACAG 1860

GTTGATTAAA ATTGATGCAG CCACXXSOOAT TCX^AGTCACA ACATGGAGAT TCIU^AAATAT 1920 

CftAflCaQTOO AATGlAAACr GGQAAAOCCG GCAGGTOOTC ATOGAGTTTG ACCAAAACGT 1980 

CTTTACTGCT rcCRCCTGOC TGAGTGCaoa TTGCAAGATT GTGCACOAGT AGATTOGCGG 2040 

CTACfllTTTC TT6TCGACX3C GCTCCAAGGA CCAGAATGRA ACACTOGATG AGGACTTGTT 2100 

CC3«aAATTG ACOGOCOGTC AGGATTGAAA CAAGOVOGCG TGCTGG13CTC ACACCAAC3^ 2160

GGCAAGC5CAA AGG0GC500CT CCCCAaA(3(3G ATCCCTAACG TaCXSCAiQCAT GTAlQATTCia 2220

GACTAACAGA CAACATACAT TCACCGCTGG TCADCCAQAT OCTCATTCAA ACCCACTGCT 2280

GGCACATCQC TTTCCTTACT TTGCCCTaTG CTACCAGCX:A CQGAAGQAGC CTCTCTTQTT 2340 

ITTTCT ATAA AATGGGTAGG CAGGAGAAAA GCAfiGT0C50C TAAGATTGCT CIAAGGOCXA 2400 

GC&TGTGQTT ACaGTTCTCT GACTTGCAGA ACCTGCCflGG TOTATGGCTA CAAGTTATCC 2460 

TCOTCCTGAT CXGTCrCATT ACTAAGTCAA TQQAGAAGAC AGAAA06TAA AAATCAGG1?G 2520 

TAGCAAGAAC AACTCTTATT TCfiCAAACTC AGGTATGAAA COAAAOGCCT GTCCTTCATG 2580 

GAACTQCTTT TAGCTCCTGT CTTTTCAAAA rGQCAQAGGG AGTTCCTACA CACACTTTTT 2640 

CCXMGRGSC OUSGTCTAG GGGTAQAAAG GGGA6GGGTG GGOCTACCAG GTAGC5M3TTG 2700 

ACAACCX^AG GTCJiGAGGRG TGGCOCTCftG TGTCATCTGT OCACACTGAT AGCTGCGAAS 2760 

ATGACCACTG A£XX3\CATCT GGTCTTAQTC ATTQ6TCTCC rC3«3ATTTCT GGGGCCAGCT 2820

GCAAOOCCCA TTCCATTCCT ACAGATCTCT CftflCXaiCCTG TAAGTCCrrT GTGAAGATGT 2880

GGGTGACS^ OGQGGACAOG AAAACX3CATT TCTCAACCCA OATCXaTGTC TCCACTGCTT 2940 

CEACTCTGOG TTGGGATTCA GGAAGACAGG CACRGTCCTC TCTGiaJCATA GAAACACCTG 3000 

OCftOTGTCAA GGATTCCAGT CABGTOTCTA TCCCAACTGS TCaOGGAiGAG AAGGGCaOAC 3060 

CCATTCTCRA A^SkCCACCAT GTCCAAGGTC TQACAGCTCC CXSUTrGGCTQ OCXX!C&CAGG 3120

GGCtTTAOGC TGGTCTGQGT CKPGGGGAAG CGTC(Xn:CrT ATOGCTQGTC TOTGTTCTCC 3180

TOaATPTGGT ATCIATOTTG 6TACGACTCC TGGCCTTTTA TCTAAAOGAC TTTGGCTTTT 3240 

GTAAArrCACA AGCCAftirAAT AQACTTTTTT CTCCCCCTCT GTTTTTTGCT GTGTCATCTC 3300 

TSCXTTTGAGA CTGCC TTGAO ACRSTGCTTG CCTTGAGAGA GTGAGOCAAT TAACASCraC 3360 

CTGAATTGIC ATTTTCCATT TTOaTTTGTT AGAGGTGOGA GC3G3TGQGTT TTEGAGAAGGT 3420

CAAAAGCAAT ACCAGAAGXA AAlGGGAAATA TCAQACAATA TTTTATTATT TTTTCATAGA 3480

7GTTGTGCCA CACAAAGAAC TTQOaSTOTA AGGATAAGGC AAAAGCTCCA ATCCCATTTT 3540 

TCAfiTTCTCC TAgGA IGCZAC CCXTTCAGGGA GCCTGGCXS^O AGTTOCGAGS CCCGT6AGCG 3600 

TCAGCTGTTG CTTTATTTTC CAICSAAAGCC CTCTGAGAAG TGaSAOCTCR GCAATTCCS30 3660 

GAGCC3U2ATA GAGACAGACT TOGCAAGGGA CCCXXTrOOTT CTGAQCCAGT AGCTGCCATC 3720

TQQAAATTCC TCTTTXAGOC TCTOCTTAflA GOTOAATGTG AATGAAGCCT OOCAGGCSiOC 3780

CSCTGAATTT CTGAOGCXrTT GCSTAAAGCT CAGAAGTGGT TTAGGCATTT GSAAAATCTG 3840 

(aXTCACATGA TAAAOAACTr GAX!i:TaAAAT OTITTCrATA GAAACAAOTG CTAAGIQTAC 3900 

OSTATTATAC TTGATGTTGG TCATTTCTCA GTOCTATTTC TCAGTTCTAT TATTTTAGAA 3960

CCTAGTCAGT TCrTTAAGAT TATAACTGGT CCTACATTAA AATAATQCTT CTCGATGTCa 4020

GATTTTAOCT OXTTGCTGCT GAQAACATCT CTGCCTAATT TAOOUVAGOC AG&CCTTCAG 4080

TiCSUUaTGC TTCCrCftGCT TTTCATAGTT GTCTGACATT TGCAIGAAAA CAAAGGAAOC 4140 

AACTTTSTTT TAACCAAACT TTGITTGQTT ACAGTTTTCA GGGGAGCX^TT TCTtCCRTGA 4200 

CaCACRGCAA CATCOCAAAG AAATAAACAA GTGTQACAAA AAAAAAAAAA AACAAAOCTA 4260 

AATOCTACTG TTCCAAAGAG CAACXTGATG GTTTTTTTTA ATACTGAGTG CAAAAGGTCA 4320

COCAAATTCC TATGATGAAA TTTTAAATTA ATOeOCACCr TTCAACATCA ITTQCTTCCT 4380

lA!rCKACAGT TGAXTCAGAA ATCIGCATIT TTOATTCTTT TATATGACTT TTAAGTAAAA 4440 

GATTTATA!EO GATTTGAAAA A3UUUUUUUM A 4471 

^ ^^'^ Protein Sequence 
Protein Accession #: NP_005161.1

1          11         21         31         41         51

|          |          |          |          |          |

MDGGnUPRSA PPAPPV PVGC AARRRPASPE LLRCSRRHHP ATABTGGGAA AVARSKEREH 60 

HRVKLVHLGP QALftQHVPHG GASKKLSIOrS TIiRSAVEYlR ALQRIxLAEHD AVIiKALAGGL 120

RPQAVRP8AE RSPFGTTPVA ASPSSASSSF GRGG&SBFGS ERSAYSSDDS 6CEGALSPAB 180 

RBXiLDFSSHIa GGIf j^93 



AACTTTAGAG AAAG6AAOGG CCAAAACTAC GACTTGGCTT TCTGAAACX^G 60

AAGCRTAAAT GTTCTTTTCC TCCATTTGTC TGGATCTGAG AACCTGCATT TQGTATTAQC 120 

TACSTGOftAGC AGTATGTATG 6TTGAA6TGC ATTGCTQCAG CIGGTAGCAT GAGTGGTGGC 180 

CRCCftGCTGC AGCTQGCTQC CCTCTOaCCC TGGCTGCTGA TGGCTACCCT GC3U3C5CftGGC 240

TTTGGACGCA CAGGACTGGT ACTGGCAGCA GCGGTOOAGT CTGRAAGATC AGCAGAACAG 300

AAftGCTGTTA TCAGAGTGAT CCCCTTQAAA ATGCSACCCCA CAGGAAAACT QAATCTCACI 360 

rrGGAAQQTG TGTTTGCTGG TGTTGCTGAA ATAACTCCAG CAGAAOGAAA ATTAATGCAG 420 

TCCCACCCRC TGTACCTGTG CAATG0CA15T GATOACQACA ATCIGGAGCC TGGATTCATC 480 

AGCATOBTCA AGCTOGAfiaG TCCTCGAOGG GCXXXX:CGCC CCTGCCTSTC ACTGGCTAGC 540 

AAOGCTCGGA TK3GCGGGT6A GC3GAGGAGCC AGTGCTGTCC TCTTTGACAT CACT6AGGAT 600

COAGCTGCTG CTGAGCAGCT GCAGCAGCCG GTGOQGCTGA CCTGGCCAOT OGTGTTDATC 660

TGGGGTAATG ACGCTGAGAA GCTGATGGAG TTTGTOTACA AGAACCAAAA GGCXXATGTG 720 

AGGATTGAGC TGAAGOnSCC CCCS6GCCTGG CCAGATCAT6 ATGTGTGSAT CCTAATGACA 780

GTGGTGGGCA CCATCTITGT GATCATOCTG GCTTGQGTGC TX30QCATCCG GTGGOSCCXX: 840

CGC3CACRGCA GGCOGGATCX: GCTTCAGCAjS AGAACRGCCT GGGOCXTCAG CCAGCTOGCX! 900

ACCAGGAGGT ACCAGGCCAG CTGCAGGCAG GCCCGGOGTG AGTGGCCAGA CTCAGGGAGC 960 

AGCraCAGCT CAOCCTOCTGT GTGIGCCATC TGTCTGGAGG AGTTCTCTGA GGGGCAGGAG 1020 

CXAGGG6TC21 TTTCCTGCCI OCATGM3TTC CATQQIARCT GTQTGGnCCC CT€GTTACKT 1080 

CAGCATGGQA CTTBCCCCCX CiGOGTGTTC AACATCACAG AGGGAGATTC ATTTTCCCAS 1140 

TCC3CTQGQAC CCTCTCGATC TTACCAAGAA CCAGGTCGAA GACTCCACCT CRTTCGCCAG 1200

CATCOCGGCC ATQCXXaCTA CCSVECTCCXTT GCTGCCTACC TGTTGGGCCC TTCCCGGAGT 1260

GCftGTGGCTC GGCCCCCAOG ACCTQGTOCC TTCXTTOCCAT CCCAGGAGCC AGGCATGGGC 1320 

CCTCTOCSlTC ACCGCTTGCC CAGAGCTGCA CATOOOOGGG CTOCAGGAGA GCAGCAGGGC 1380

CTGGCAGGAG CCCAGCACCC CTATGCACAA OGCTGGGQAA TGAGCCAOCT OCAATOC21CC 1440 

TCACAQCACC CTQCTGGTTG CCCAGTGCCC CTACGCCGGG OCRGGCCCCC TGACRQCAQT 1500

GGATCTGQAG AAAGCTATTG CACAGAAGGC AGTGGGTACC TGGCAGATOG GCCAGCCAGT 1560 

GACXCCAGCT CAGGGOCXTTG TGATOGCTCT TCCAGTGACT CTQTGGTCAA CT€CACGGAC 1620 

ATCftGOCTAC AGGQQGTCCA TGGCAGCAGT TCTACTTTClr GCAiGCTOCCT AAQCAGTQAC 1680

TTTOaCOCCC TAGTGTACTG CRGCCCTAAA GOaOATCOCX: AGCQAGTGGA CATGCAGCCT 1740 

AGTGTGAC5CT CTCGGCCTCn TTCCTTGGAC 1CXX5TGGTGC CXXCAGGGGA AAOCCAGGTT 1800

TCCAGCCATG TCCACTACXA CCGCCAC5CGG CAGCACCAjCT ACSiAAAASCG GTTOC3VfiTGG 1860 

CATOGCAGGA AGCCTGGCCC Al&AAACCGGA GTCGCCCAGT CCAGGCCTCC TATTCCTCQG 1920 

AC:AQU3CCCC AiSCXlACAGCC AOCITCTOCT GATCAQCAAG TCAGCGQ2VTC CAACICAGCA 1980

GCCOCrrCOG GGOGGCTCTC TAACCCACAG TGCCGCAGGG CCCIOCCIGA GCCAGOCCCT 2040 

GGCGCflGTTG AOGCCXGCAG CATCTGCCCC AGTACCAOCA GTCTSTTCAA CTTGCAAAAA 2100

TCC3W3CCTCr CTGCCCSaACA tXCTiC!RGBJGG AAAROGOGGG GGGGTCCCTC OGAGCCCACC 2160

CCTGQCTCrrC GGCCCCftGGA TGCAACTGTO CRCXXAQCTT QOCAGATTTT TCCCCATTAC 2220 

AGCCCCAGTG T6GCATATCC ITGGTCCCXJk GAG6CACACC CCTT6ATCTG TGGAOCTCCA 2280

GGCCIGGAC;^ AQAGQCTGCT ACCMSU^C CCA0QCCCCT GTTACTrCAAA ITCACAGCCA 2340

GTGTGGTTGT GCCTGACTCX: TCQCCftfiCCC CTGGAACCAC ATOCACCTGG GGAQGGGCCT 2400

TCrCGRATGGK GTTCCGACAC CGCAGAGGGC AGGCCATGCJC CTTATCCGCA CiGCCAGGTG 2460 

CTGTCGGCCC AGCCTGGCTC AGAGGAGGAA CXOGAOGAGC TGTGTGAACA GGCIGTGTGA 2520 

GAlGltCMSG CCIAGCTOCA AOGAAISAGTG TtSCTCCAGAT GTOTTTaGGC CCTAOCTGGO 2580

ACAGAGTCCT GCTCCTGGGA AAGOAAAGGA OCACACCAAA CACCATTCTT TXT60CGTAC 2640

TTCCrTASAAG CACITGGAAGA GGACTGGTGA TGGTGGAGGG TGAGAGGGTG CCBTTTCCTG 2700

CTCCAGCTGC AGACCTTGTC TGCAGAAAAC ATCTGCAGTG CAfiCAAATCC ATGTCCAGCC 2760 

AGGCA/^CAG CUGCSQCX^EQ TGGQGTGIGT GGGCTGGATC CC?n<?AAGQC TGAQTTTTTG 2820 

A0G6CAGAAA GCTAGCTATO OGTAGCCAGB TGITACAAAS GTGCTGCTCC TTCTCCAACC 2880

CCTACTOXSGT TTCCCTCAjCC CCAAGCCTCA TGTTCATACC AGCC»QTGGQ TTCAGCAGAA 2940 

CQCMGACAC CTTATCACCT CCCrCXTTTOG GXGAGCTCTG AACAOCAGCT TTGGCCCCXC 3000

CACAGTAAGG CTGCTACATC AGC3QGCAACC CTGGCTCTAT CATTTTCCTT TTTTGCCAAA 3060 

AGQACCAGTA GCATAGGTGA QCCCTGAGCA CTAAAAOSAG GGGTCCCTGA AGCTTTCCCA 3120 

CTATAGTGTG GAGrTCTGTC CCIGAGOTGO GTACRGCAGC CTTQGTTCCT CTCGGGGTTG 3180 

AGAATAAGAA TAOrGGGOAG OGAAAAACTC CTCCTTGAAG ATETCCTCTC TCAGAGTGCC 3240 

T^GAGAGGTAO AAAG6AGGAA TTTCTGCTQS ACTTTATCTG GGCAGAGGAA GGATGGAATG 3300

AAGGTAGAAA AGGC3U3AATT ACTkGCIGAGC GGOGACAACA AAGAGTTCTT CTCTGGGAAA 3360 

AGTTTTOTC7 T^SAGCAAGG ATG6AAAATG OOGACAACAA AGObAAAGCA AA6T6TGAGC 3420 

CTTGGOTTTG GACAGCCCAG AGGCCCAGCr CCCC2VSTATA AGOCATACAQ GCCAGGGAC^C? 3480 

CACAGGAGAG TGGATTAGAG CACAAGTCTG GCXTTCACIGA GTGGACAAGA 6CT6AT6GGC 3540 

CTCATCAGGG TGACATTCAC CCCAGGGCA0 OCTQAGCACT CTSGQCCCUT CA0(3C3VrTAl! 3600

CCX31TTTGOA ATQIGAATQT GGTGGCAAAG TGG6CACAGG AGCCCAOCTG GGAACCTTTT 3660 
TOCCICAGTT AGTGGOGAGA CTAGCAGCIA GGTACCiCACA TOQQTATTTA TATCTGAAOC 3720 

AOAlCnGAGGC l^OAATCAGG CACXATGTTA AGAAKEKIAT TTATTTGCTA ATATATTTAT 3780 

CCACAAAAAA AAAAAAAAAA AA 3803

Seq ID NO: C29 Protein Sequence
Protein Accession #: NP_0042eo.2

1          11         21         31         41         51

|          |          |          |          |          |

meSAHYHVN PSQAISOrWN TiHBftTT^TiCPN NTFRRDPTAR TSQSQBPFLQ UflSHmfPSQ 60 
TIjPGim^TGF laSPVDBlHMRN LTSOpliZrYCD DIHIFBSI^ KSLATEDNFD PIXWSQIiFDH 120 
FD8DSGLSLD GSSOifSrraVXK SNSSH8:VC3>G GAIGYCTDHB SSSHHDDEQA VGG^PSSSK 180
liCHLDQSDSD FRCaOiTPQEV FHHBTXHI^F n^PEGTSEPf PWPGKSQKXR SSyT^EDTDRN 240
LSBDBQRARA LHIPFSVIW IPGMFVDSFNS MbSRVYT-TPI* QVSLZRDZRR RGKNKVAAQS 300
CRKRKLDIIIi HLBDDVCNLQ AKKBTIrKRBQ AQCNKAINIM KQKUIDIjYBD XF9BXJIDZ3QG 360 
RFVnPNHyAI. QCTHDGSXX*! VFREL-VASGH KKBTQKCSKRK 400 

Seq ID NO: C30 DNA Sequence

Nucleic Acid Accession #: NM_004442
Coding sequence: 19..2982

GCCCCGGOAA GCGCAGCCAT QGCTCTGCGG AGGCTGOGGG CCGOGCTGCT GCTGCTGCCG 60

CTGCTC3GCCG CGGTGGAAO;^ AAC3GCTAATG GACTCX».CTA CRGCGACTGC TGAGCTQCSGC 120 

TQGATGGTOC ATCCTCCATC AGGGTGGQAA DAGGTGAGTG GCTACGATTaA GAACJ^TGAAC 180

CGTACCAGGT GTGCAAOGTG OPTTGAGTCaA GCCAGJUICRA CTGGCTACGG 240

ACCAACTTTA TCCSGOGCOG TGGOGCCCAC COCATCCAOG TGGAGATGAA GTTTTCGGTG 300 

COTGACTGCh GCAGCATCXX: CAiGOGTGCJCT GGCTCCTGCA AGGAGACCTT CAACCTCTAT 360 

TACTATGAGG CTGACTTTGA CTCOaCCACC AAGACCTTCC CCAACTGGAT GGAGAATCCA 420

TGGGTGAAGQ TGGATACCAT TGCAGCCGAC GAGAGCTTCT CCCAGGTOC3A CCTGGGTGGC 480

OQCGTCAT6A AAATCAACAC CGAGGTGOGG AGCTTCCGAC CTGTGTCCCG CAGCGGCTTC 540

rACCTGQCCT TCCAGGACTA TGGOGGCTGC ATGTCCCTCA TCGCX33TaC3G TGTCTTCTAC 600 

CX?CAA6TGCC OCOSCATCAT CCAfJAATGGC GCCATCTTCC AQGAAACCCT GTOGGQGGCT 660

GAGAQCACAT CGCTGGTGGC TGCCCC3GC3GC AfiCXGCSVTOQ CGAATGCTOA AGAGGTGGAT 720 

GTACCCATCA AGCTCTACTQ TAACOGGGAC GGCGAGTGGC XGGTGCCCAT CGGGOC3CTOC 780 

ArrGTGCSUUVS CAGGCTTCGA QGCX:GTTGAG AATGGCACOQ TCTacCQACJO TTGTOCATCT 840

□GGACTTTCA AGGCCAACCA AGGOaATORQ GCCTGTACOC ACTGTCCCAT CAACAGCCGG 900

ACCACriCTG AAGGGGOCAC CAACTGTGTC TGCC3GCAATQ GCTACTACaG AGCAGACCTG 960 

QACCCOCTGG ACATGOCCTG CRCAACCATC CCXJTCCGCGC Ca:AGGCTGT GAITTCX»QT 1020 

GTCAATGAGA CCTTCCCTCAT GCTGGAGTGG ACCCCTCCCC GCSQACTCOSG AGGC06AGA6 1080 

GftOCTOGTCT ACAACRTCAT CTOCAAGAGC TGTGGCTCGG GCOGGGC5TGC CTGCACCCGC 1140 

^EGOGGGGACA ATGTACAGIA CGCAOCACGC CAGCTAQGCC TGACC?GAGCC ACGCSllTTAC 1200

ATCAGTOAOC TGCTQGCCCA CACCCAGTAC AOCTTCGAGA TCCAGGCTGT QAACGGOGTT 1260 

ACTGACCAQA aCCX:CTTCTC GCCTCA6TTC 6CCTCTGTGA ACATCACCAC CAACXIAGGCA 1320 

QCTCCATCGG CRGTGTCCAT CATQCATCM GTQAGCCGCA COGTGGACftS CftTTACCCTO 1380 

TOSTGGTCCC AGC!C3«3AOCA GCCCAATGGC GTGATCCTGG ACTATQAGCT 6CAGTACTAT 1440 

BAQAAGGACC TCAGXGAGTA CZAACGOCACA GCCATAAAAA GCCCCACCAA CACX3GTCACC 1500 

GTGCAGGSCC tCAAAGOGGG OSCCATCTAT GTCTTCCAGG TGG^GC3«33 CACa3TGGCA 1560

GGCTACOGGr GCTACAGOGG CAAOAITGTAC TTCCAGACX». TOACAGAAGC OGAGTACCAG 1620 

ACAAGCATCC AGGAGA7U3TT onCACTCATC ATCJGGCTCCT CGGCGGCTGG CCTGaTCTTC 1680 

CTCATTGCTO TQGTTGTCAT GGCCATCGTG T6TAAiC!AGAA GAiCXSGGGGTT TGAGCGTGCT 1740

GACTGGGAGT ACAOeGAOAA 6CTQCAACAC TACACCAGTG GOCACATGAC CCCAGGCATG 1800

AAGATCEACA TOaATCCTTT CAOCTROGAG GACCOCWUOG AGGCaGTGCJG GGAGTTTGCC 1860

AAGGAAATTG ACATCTCCTG TGTCAAAATT GAGCASGT6A TOGGAGCAQG QGAOITTGGC 1920 

GAGGTCTQCA QTGGCCACCT GAAGCTGCCA GGCSlAeiifiAQ AGATCTITGT GGCXSITCAAG 1980

ACGCTCAAGT CGGGCTACAC GQAiGAAQCAG O0COGGGACT TCCTGftGCGA AGCCTCCATC 2040 

ATOGGCCAGT TCQACCATOC CAAOGTCATC CACCTGGRGQ GTGTCQTGAC CAAGAGCACA 2100 

CCTGIGATGA TChTCACOGA GTTCATGQAG AATGGCTCCC TGGACTCCIT TCrCOGGCAA 2160 

AACGATGGGC AQTTCACAGT CATCCAGCI6 GTGGGCAIGC TTC3GGGGCAT OGCAGCTGGC 2220 

ATOAAGTACC TGGCAGACAT GAACTATGTT C3VCCGTGRCC TGGCTGCCC6 CAACATCCTC 2280

GTCAACIVGCA AGCTGGXCTG CAAGGT6TGG GACTTTGGGC TCTCA06CTT TCTAGAGGAC: 2340

OMACCTCAG AGCGCAOCTA CACXSUOTGOC CIGGGCOGAA AGATCCCCAT CcaETTGGACA 2400 

GCOCOGGAAG CCATCCAGTA C06GAAGTTC ACCTCGGCCA GTaATGTGTG GAGdAGGGC 2460 

ATTGTCATGT GGGAGGTGAT GTCqTATOGG GAGCGGCCCT ACTGOGACAT OACCAAOCRG 2520

GATGTAATCA ATGCCATTGA GCAGGACTAT CSGCTGCCAC CGCCCRTGGA CIGCCCXSftJGC 2580

GCOCXGCAOC AACICATGCT GGACTGTTGG CAGAAGGACC GClRACCaCC30 GCCCAAGTTC 2640

GGCCAAATTG TCAAC7U3GCI AGACIAASATG ATCGGCAATC CCAACAOCCT CAAAGGCATO 2700

GCJBOOCCTCT OCXClGaCAT CSiACCTGOOG CTGCTGGRlGC GCAOGATC3CC COACTACAOC 2760 

AOCTTtTAAtA CGGTGGAOGA GTOGCTOGAG QCCATCAAGA 7X?GGGCAGTA CAAGGAGAGC 2820

TTCGCCAATG CCGGCTTCAC CTCCTTTGAC CTOGTGTCTC AGATGATGAT QGAiaaACATT 2880 

CTC OGGG TTO GGCTCACTTT GGCTGGCGAC CftlGAAAAAAA TCCTQAACAG TATCCAGGTG 2940 

ArtQCSaoaCGC AGKTGAACCA SATTGAOTCT OIGGftGCTTT GACATTGAOC TGOCTCX3GCT 3000

CACCTCTTCC tTGCAAGCX^CC GGGCCdCTS CCOCAfiGTGC CoaCCCTCCT GCSTGCTCTAT 3060

CGACTGCSGG GOCAGOCACT OQCCAIsaAGa OCAGGGOGCA CGGGAAGAAC CAAGCGCTTGC 3120 

CAGCCACGAG AOSTCACCAA GAAAACAT6C AACTCAAAC36 AOGQAAAAAA AAA66GAATG 3180 

GGAAAAAAOA AAACAGATCG IGGCAGGGGQ CQGGAAATAC AJVQGAATATT T^miAAAGAG 3240 

GATTCTCATA AOQAAAOClkA. TOACTOTTCT TGGO0GG6AT AAAAAAGGQC TTGGGAGATT 3300 

CAXGCQATGT GTCCAAT06G AGACAAAAfSC AOTTTCTCTC CAACTCCCTC OMSaOAAtiGTC 3360 

ACCTOGOCAG AGCCAAOAAA CACTTTCAJGA AAAACAAA1G TGAAGGGGAG AGACAGGGGC 3420 

CX3O0CTT6QC TCCTGTCCCI GCTGCTCClC tCAGGCX3C3iC TCAACAAOCA AGCGCCTOOA 3480

GGAGQGGACA GATGGACSkGA CAGCCACCCT GAOAAOOCCT CTGGGAAAAT CXATTCCTGC 3540

CACCACIGQ8 CSAACAOAAO AATTTTTCTG TCITIGGftfiA GTATTTTAGA AACICCAAT6 3600

AAAGACnCTC TTTCTCCTSr TGGCTCACAG GOCTGAAAGG GGCTTTTGTC CTCCIGQQTC 3660 

AGGGAGAAGG CGGGGACCCC AQAAAGGTCA GCCTTGCTGA GGATGGGCAA CCOCAGGTCT 3720 

GCRGCTCCAa GTACATATCA OGOGCACAGC CTGGCAOCCT GGCCCTCCTG GIGOOMnxi 3780 

CCX3CCASCCC CTGCXrcCGBG GACnaATACX GCAOTGACIG COSTCAGCTC COACTGCOGC 3840 

TG2)GAAGaaT TOATCCTGCA TCTGGGTITS TTXACASCAA TTCCTGGACT GOGGGGTATT 3900 

TTGGTCACA6 GGTGSlTL'TTG GrTTAGGGOG ITTtfrTKSlT GG&rr&ITTT TTGTTTTTTG 3960 

GTTTTTTTTA ATGACAATGA AGTGA<A,CTT T6ACATTTCC TAOCTTTT G A GGACTTGATC 4020 

CTTCTOCAGG AAGAASGTGC TTrCTGCTTA CrQACTT2U3G CAATACRCCA Ai3GG0GANAT 4080 

ITTTATATGCA CATTTCTGGA TTTTTTTAIA CGGTTTTCRT TGACftCTCTT OCX3CCTCCC 4140

ACCTGCCACC AGGOCICACC AAAGCCCACT GCXSOGGGGC CASCKQQGCC ATTCZUiAGAC 4200 

TOGnGlOAGA TTTQG6TGTG GAQOGGOAGG CGCCAftGCTG GMSGAfiCTTC CCACTCCAGQ 4260 

ACTGTTGATO AAAGGGACAG ATTGAGGAGG AA6TGGGCTC TOAGGCTGCA GGGCIGGAAG 4320 

TCCTTGCCCA CTTCOCACTC TC5CXGCCCCA ATCTATCTAG TACTTCCCAG GCAAAXAGGC 4380 

GCCTTTOAGa CTCCTGAaTG OOCTCAGATG GTCAAAACCSC AOTTTTCCCT CTGGGRGCCT 4440 

AAACCAGGCT OCAT0GGAO8 CCAGOACCCQ IMCATSCAC TGTGATACCC TSCCCTCCAO 4500 

ACSGGlGOBCr CS^GAGACftCG GGCAAGCAT6 CCTCTTtSOCr TOCCTGGAGA GAAAGT6IGT 4560 

GATTTCTCTC CCACCTCCTT CCCCCCACCA QACCTTTGCT GGGCCTAAAG GTCTTGQCCA 4620 

TGGGGACGCC CTCAGTClAG OGATCTGGOC ACAGACTCCC TCCTOTQAAC CAACACAGAC 4680 

AOCCAAGCAG AGCAATCAGT TAGT13AAIT6 GAATTCCCCA AGTCTTTGCT AIlGTGAATA 4740

QTGCTGCaiAT AAACKEAOGT GIGCRTOTOT CTTTKO^A GAATGftTCTA XAATOCTCTC 4800 

GGTATGXACC CMTAA TBGG ATTGCIGGGT CAAATOGTTT TTCTGGTTCT AGATCCTTGA 4860 

GGAATTOCCA CACEGTCTTr CACAATGOTT GAACXftATET ACACCCCXAC CAACAOTGTA 4920

AAAGTCTTCC TGTTTCTGCA CSITCCTCTCC AflCATCTOTT OTTTOCTQAC TTTTTAATGA 4980

TTGCCATTCT AACTGGT6TG AGATGGTATC TCATTGTGGG TTTGATTTGC ATTTCTCTAA 5040 

TQACCAGTGA AQATGAGCTT TTTTTCA.TAT QTTTGTTQGC CACATGrTTG TTGTITCTTT 5100

TGAGAAGTBT CTGTTCA.TAA CCTTCaCC2&A TTTTTGATGA GGTTGTTTGT TCTTrPCTTa 5160

TAAACTTAAT G7VAATA2UU3C ATGAAGhCAA GATTfUQAAGA. AAAAGAATGA. AAAGGAACAft 5220 

ACAAAGCGTC CAAGAAATAl* GGGACTATGT GACAAGAACA AACTTAOGTT TGACTGGTET 5280

GfCTQAAAATG ACAGQQJiaAA TQAAACCAAG TTtSOAAAACA CTCTICMSGh TATTATCCAG 5340 

GAGAACTTCT CCAACCTAfiC AAGACAGACC AACGTTCAAA TTCAGGAAAT AC3VGAGAACA 5400 

CCCAAAGATA TTTCTOGAGA AGA6CAAOGC GAA6ACACAT AATTGTCftGA TTCACCAAGG 5460

TT6AAATGAA GGAAAAAAT8 CTAAGGGCAO CC&0AGAGAA AGGTCAGGTT ACTCACAAAA 5520 

GGAAGCCCAT CAGACTAACA GCRCATCTCT CT<3CA(5AAAC CCTACAACCT AGAAfiaSAGT 5580

GG3GAOC3WVr ATTCAACATT CACAAAOAAA AGAATITTCA ATCCftGAATT TCATATCCA6 5640 

CCA AACTA AB CTTCATAAGC AAAGGAGAAA TA2UUVTCCTT TACAGACAAG CAAAtCCTGG 5700 

GAGATTTTC^r OlCCACCAGO CCTQCCTTAC AACSU».TCCT GAAGGAAGCA CTAAATATOG 5760 

AAAAGAAAAA CTGGTGOCAG CCACTGCAAA AAATACCAAA TTOTASAGAC CATTGACAjCTT 5820 

ATORAGAAAC CGTGTCAACT AATOOOCaiAA ATAACCAGCT AiGTATCATAA TQACAGGATC 5880 

AGATTCACAC ATAACAATAT TAACCTTATA TCXARATQaG CTAAATCCCC CAATTAAAAG 5940 

ACaC3U3ACTG GCAAAITGGT TAAAOAQTCA AGACTCATTG GTCarGCOtsXA TTCAGGAGAC 6000 

CCKICICAOS TGCAAAGACA CftCATAGGGT CAGAGTAAZkA GGGATACAGG GGAftTTC ISO 57 

Seq ID NO: C31 DNA Sequence

Nucleic Acid Accession #: NM_03a942.1

Coding sequence: 145..1260



1          11         21         31         41         51

|          |          |          |          |          |

CCOGASCCCC GCCCCTCCGO GCCCGGGTOS GCGCX5CCCAQ CCTQCCAQCX: GOGCTGCTGC 60 

TGCTCCTCCT GCTGTGGGAC OGClGACCDGC aCGGlCTQCTC OGCTCTCCCX: GCTCCAAGCB 120 

CSCGATCTGGG Cf&CCOGCCAC CAOCWTQGAC GCTOGCOQOe HGCOQCAOAA AQATCTCAGI^ 180

GlAAAtanAjQA. ACTTAAM3AA ATTCAGATAT GtGAAGTTGA TTTOCATGGA AACCTGGTCA 240 

TCCTCTCATG ACAfiTTGTGA CAGCTTTGCT TCTGATAATT TTGGAAAC3U7 C3A13GCrGCAG 300 

TCAGTTCGCa AAOaCTGTAG GACCCGCftGC CACSTGCawSQC ACTCTGGACC TCTCAGOGTG 360 

GCGATGAAGT TTCCA60GCG GAGTAC3CAGQ GGAGCAACCA ACAAAAAAGC AGAGTCCCGC 420 

C?U30C!CTCAG AGAATTCTGT CACIGILTTCC AACTCCGATT CAGAAfiATGft AAGIGCaATG 480 

AATTTTTTG6 AOAAAAGGGC TTTAAATATA AAOCAAAACA MGtaiKTGCT TGCAAAACTC 540 

ATQTCTGAAT TAGAAAGCTT CXXTrGGCTCXS TFCCGTGQAA GACATOCOCT CCCAGGCTCC 600 

GACTCRCAAT CAAGGAGACG QCGAAGGCOT ACATTOCCGG GIGTTGCTTC CAGGftGAAAC 660 

CCIGAACGGA GAGCTOGTCC TCTTACCAGG TCAAGGTC3C3C GGATCCIOGG GTCCCTTGAC 720 

GCTCTACCCA TGGAGGAGGA GGAGGAAGAG GATAAGThCA TGZIGGTGAG AAAGAG6AAG 780 

AOCXnOGAIG GCXACATGAA TGAAGKTGAC CTOCCCASAA GC30GTGGCTC Q^MXATCC 840 

QTGACCCTTC CGCATAIAAT TQGCCCAGTG G7VAGAAATTA CAGAGGAGGA GTTGGAGAAC 900 

GTCTGCA6CA ATTCTCGAGA GAAGATATAT AACCGTTCAC TGGGCTCTAC TTGTChTCRA 960

TQOOGTCAGA AGACTATTOA TACCT^AAACA AACTGCAGAA ACCX3U3ACTG CT0G0G06TT 1020 

OGAGGCCAX3T TCTGTGGCGC CTGOCTICX3A AAKXXIZTATG GTGAAGAGGT ChGGGZUFGCT 1080

CTGCTGGATC CGAACTGGCA TTG0CX3GCCT TGTOSAQGAA TCIGCSUlCTG CAGTTTCTGC 1140 

CGGCAOCGAG ATGGAOGGTG ^TGOGACTGOa GTC3CTTGTGT ATTTAGCCAA ATATCU'GGC 1200 

TTTGGGAATG TGCiATGCCZA CXTGAAAAGC CretfAACRGG AATTTGAAAT GCAAGCATAA 1260 

T ATCTGG AAA ATTTGCTOOC XGCCTTCXAC TTCXCRAATC TTTCTOGTAA AAQTTTCCftA 1320 

TTTTTTCACr GAAACCTGAG TTAAAMTCI TGATGATCAO CGZOTTTCAT AAGAAACTCC 1380 

AATCauUSXTA ATCTTAQCnG ACIWCGIJGTTT CTGGftGCftTC ACftGftAGGTA TMIQCTAOT 1440 

TACACTTTGC CCXOCaXSCAG TTTCTTCPCT GCTOOCAACC COC31TCTCAT AGCRTOCCCG 1500

TCIATTTCCA ATGCTCCTCT CCAACOGCTT AjSTTTCTQAA TTTCTTTTAA ATTACAGTTT 1560 

TATGAAAGCA rATTTTATTT ACTTQOTGTT GAAATAJSCOC ^ECAThMfkJCC XAAGCACITG 1620 

GAAACACAAT AATAGTATTA ACTAACTAGA TCTATTGAAT TTCASAGAAG AGGCTTCTAA 1680 

CTTGTrTACA CAAAAAGGAC TATGATrrAG CACTCATACT ASZTGAAATT TITAATAGAA 1740 

TCAAiGGCACA AAAGTCITAA AACCATGTOG AAAAATTAGG TAATTATTSC AGATXGAlfGT 1800 

CPCTCAATCC CATGTATTGC GCSmVTGTTA CAflOTTGTTG TCACAGTTGA QACTTAATTT 1860

CTCCTAATTT CTTCTGCOOG AAGGGTAAGT GQTGCQTCCa OCTTACftCGR ICATAATTCA 1920 

AAGGTTGGTG GGCAATGTAA ^ACTTAATTA AAAIAKTCSKC GGIMGAGCXA TCIGGAGATT 1980 

ATGAGTAAGC TGATTTQAAT TTTCACTATA AAACTTTAGT ATAATTGTAG TTTSCAAAGr 2040 

TTATTTCAGT TCAraTGTAA GGIATTGC!AA ATAAATTCXT GGftCAATTTT GTATGOAAAC 2100 

trrOATATTAA AA2\CrAGrCT OTGOTTCrTT GGAGTTTCTT GTAAATTTAT AAACCAGGCA 2160 

CAASGTTCAA OTTTAGATTT TAAGCACTTT rEATAACAATa ATAAGTGCCT TTTIGGAGAT 2220 

GTAACTTTTA GC3«5TTTGTr AACCTGACRT CTCTGCCAGT CTAGITTCTO GGCAGOTTTC 2280 

CTGTGTCAGT ATOJOCJCCCrC CTCTTTGCRT TAATCAA^JT A3TTGGTAGA OGTGGAAXCT 2340 

AAGlXSTTTGT ATGTOCaATT 'EACTTGCATA TGTAAACCAT TGCTGTGCCA TXCAATGTTT 2400 

QATGCATAAT TGGAOCXTGA ATCGATAAOT GIAAATACAG CTTTTGATCT GTAATGCTXI 2460 

TAITACAAAAG T7TATTTTAA "EAAIAAAATO TTTGTTCTAA AAAAAAAAAA 2510 



Seq ID NO: C32 DNA Sequence
Nucleic Acid Accession #: NM_012445.1

Coding sequence: 276..1271

1          11         21         31         41         51

|          |          |          |          |          |

GCAOGAGGGA AGAGGGTGAT CCGACCCGGG GAAGSTOGCT GGGC3U3GGCG AGTTGGGAAA 60 

GCX5GCAGCX:C GGGCCQCCCC OGCAGCCCCirT TCTCCTCCTT TCTCCCACGT CCTKTClGGC 120 

TCXaSCTGGA GGCCAGGCCO TQCAGCATOG AAGACAOGAG GhACTGGAGC CTCATTGGCC 180 

GGCCCGGGGC GCCGGCCTCG QQCTrAAATA GGAGCTCCGG GCTCIGGCTO GGACCCGACC 240 

GCTGCOGGCC GOGCTCCOGC TGCTCXTTGCC GGGTGATGGA AAAOCCCAGC CCGGCCQC5CX3 300 

CCCtGGGCAA OGOCCTCXGC CCTCXCCXCC TGGCCACTCT COGCGCOGCC GGOCAGCCTC 360 

TrGGGGQAGA GTGCATCTGT TCCGCCAGAG GCX^GBGCCZU^ ATACAGCATC ACCTTCAGGG 420 

OCAAGTGQAG GCAGAQGGCC TTOGOCAAGC AGTACCCCCX OTTOCGCCCC CCTGGOCAGT 480

QGTCTTCSGCT GCTGOGGGCC GQGCATAGCT GGQAClACAa CATGXGGAQO AAOAACCRGT 540 

AOGXCAGTAA GGGGCTGOGC GACTTTGCGG AGOGCSQGCGA GGCCIGG606 CTGATWUSG 600 


AQATCGAGGC 
TCOCCAGCGG 
TCTCQTTTGT 
ACCTGTGQ3A. 
CCGOGMXSGA 
CGGTGACCQA 
GGCTGAAGGC 
GGfGCCTTCAT 
CCTCAGtTCC 
GCQQAGGCCA 

ACTQCGTCTA 
GGCTCCTOra 
GACCGCGGTG 
GGCArrOGGA. 

AATTATGQTC 
CCTGGCTCOC 
CTCTCCCGAfi 
GGAAGCGTCA 
TGCTCAC 



GGCOG6GG9U5 
CACXX3GGCA6 
C30TGCGCATC 
OGGGC3ACOGT 
CAGCX5GCTTC 
GATAAOSIOC 
CCXGCCTCCX: 
CCCTCCCGCC 
AGAAACGCCG 
CTGTGQGAGG 
aSGGfiGCCCC 
AGACCAGAGC 
CAGGCTCATG 
AGGCCQCQCC 

AfiCRGtxrrcx: 

GOCTCCICCT 
TCCTTATAAG 
ACGTGGTTOC 
GQCGCATCCA 
GlXSTTTCCAT 



ACGTCGGGGG 
GTGCC5CAGCC 

ACCITCTCCT 
TCCTCTGCCn. 
ATOGCCAGGG 
CCAGTOCTGC 
CXGGACTGCG 
CTOGGGACCA 
TGCCCCIBAGC 
CCOGCAGCCC 
CTGCAGGCGG 
GACCATCXCT 
TCCTTTCOCA 
OCTQCAGGAT 
TTATTGCTGC 
AGATACCXCA 
AGGOGOGGCX: 
6TTATGGATC 



GCQTGCACGC 
AGCrrGGAGOT 
CCGACTGGTT 
AGGCXa3C3GCT 
CCCCCAACTT 
GCChCCCXSBC 
TGACRCTGGT 
CCAGOVGGGA 
AGGTCTCCCr 
AGAiGC&iGGAC 
TGGAAGAA6A 
CT«313GCCCC 
COGAGGCACA 
GCACTGAAGS 
ACCTTGCTTC 
AAAOTCATCC 
TCXZAGGAiGAT 
GACCTQQTGC 
ACTTGAGAAG 
TCTCTGOGTT 



Seq ID NO: C33 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1314



1 
I 

ATQTTACAGG 
AAACCCCGTA 



TACTTCCTCT 
CTGGACTGTC 
GCAGTGGCA3 
GGGAACIGGT 
AGGGRGATBQ 
QATCTGGATG 
GGGOCCTGTC 
AAQAOCCCCC 
AGCATOCAGT 
CTCACQGCAGr 
GGCrCAGACA 
TTCAACCCCA 
ACTTTCTCJVQ 
QOCftOOCGAC 
GACaVTACXGC 
GGQTACCAjSG 
GACAOCTGCC 
GTGQGCATCG 
AAGGTCTCAQ 



11 
1 

AxccrroACAcs 

TCCOCATOGA 
CGACTATCAT 
GCGGGCAGCC 
CCTTGOgGGA 
TCCGOCTCTC 
TClCTGCCrrG 

ttgttgaaat 

TCTCAGGCTC 
t3T6TGGTGQG 
AOQACAAACA 
CCCftCTGCTT 
AACTGGGCAG 
TGTACCXXSA 
QCACA&TCA6 
TCTGGATCAT 
TGCAdSGCOrC 
GGGAAGTCAC 
AGG6TGACA6 
TTAGCTGGGG 
CXTATCTCftA 



21 

I 

TGATCAACCr 
GACCTTCAGA 
CATTGTGGTT 
TCTCCACTTC 
GGACSAGGAG 
CAAGQAC06A 
TTTCX3ACAAC 
GAAACCCAC7 
CACAGAAAAC 
CCTQQTCTCC 
TGGGGAGGAG 
GCACGTC1X3T 
CftGGaflA CAT 
CITGCXA7CC 
AGACAATGAC 
GCC3CATCTGT 
TGGhTGGOaC 
A9TCCAG6TC 
OGAGAAQATO 
TGGTGGGCCC 
CTATGGCTGC 
GTG0ATCTAC 



"31 
I 

CrOAACAGCC 
AAOGTGGGGA 
6TCCTCATCA 
ATCCOGAGGA 
CACTGTGTCA 
TCCACACIQC 
TTCACAGAAG 
TTCftGAGCra 
AGC3CaGGAGC 

CTGcaicrrGTc 

GCCTCTOTGG 
OGAGOGAGC3V 
ACCI9VTGT6T 
CTGGCTGTQa 
ATC3GCCCTCA 
CTGOCCTTCr 
TTTAOGAAGC 
ATlGACaiQC^ 
ATGT6T6CAG 
CTGATOTAOC 
OGGGGCCOGA 
AA'IGICXOBA 



GGIGIirTTGG 
GCAGCGCAOQ 
OGTGGGOGTG 
GGACCT3TAC 
GGCCACCATC 
CRACTCCTTC 
GCGGCTGOSA 
CAATGAGATT 
GTGGTCGTCC 
TCGCTAOGTC 
OQCTGAGTaC 
OGGAGCCATG 
oqqggtttog 
GCCCTCTGGT 
TTAQGGGCCC 
CCAAGGCrCC 
TQTCCTTCAT 
TCTAGGCTGr 
TQAATAAATG 
TGAATAAAOA 



41 

I 

tc3gatgtcaa 
tccccatcat 
aggtgattct 

AQCAiGCrrGTG 
AGAGCTTOCC 
iU3GTGGTGGA 
CTCraSCTQA 
TGGAGATTGG 
TTOSCATOOa 
TTGCCTGTGG 
ATTCTTGQOC 
TCCTQGACOC 
TCAACTGGAA 
CCAAGATCAT 
TGAAGCTGCA 
TTGATGAGGA 
AGAATGGAGG 
CACOGIGCAA 
GCATCCCGGA 
AATCTGACCA 
GCACCCCft<?G 
AGGCTGAGCX 



Seq ID NO: C34 DNA Sequence

Nucleic Acid Accession #: NM_003045.1

Coding sequence: 148..2037



1 
I 

CGATCCTGCC 
CIGAGACATC 
CSQTCATATTC 
CIU3ATGCTGC 
CEQAACTVCTr 
GTCCTGGCTG 
ATGOCXGOOC 
C9CCAAGAOGG 
ATCAOOQGCr 
TGOAOOGCCA 
ATGACTCTOA 
ATTCTCATCr 
ATATTCACTT 
OGATGGGTTA 
TOTTTGAACA 
TTCTCTQGTG 
ATOQCCACCA 

ATOCCCTACr 
TGGGAAaOTS 
CTAGGTTCCA 
TTTAAATTCr 
TOSGGTGCOG 
ATGTCCATTG 
TAOCAOOCM? 
CCAGCAfiACC 

Ga^anGATOT 



11 
I 

GGAGOCCOGC 
TTTGCTGCAA 
CAGCTC3X3AA 
GG056AAG6T 
TTGKKSTGOT 
GAOCTGTGGC 
TGGCCXCaGr 
GCTCAGCTXA 
GGAACXXAAT 
CCITOQACGA 
ACGCC0CC3GG 
TGACRGGACT 
GTATTAAOGT 
AAAACTGGCA 
ATGACACftAA 
TCCTGTOGGG 
CAGGTGAAGA 
TGATCTGCrr 
TCTGCCTGGA 
C CAAG TACBC 
TGTTTCCCAT 
TAOCCAAOGT 
TTGCIGCXGT 
GC3UCTCTCCT 
ASCAGCCTAA 
AAAATGAATT 
TCTCTTTGAA 



21 
! 

GGCOGCOGGC 
GATOGAQGCT 
CAGQUUCATQ 
GQTGCW^TGT 
GGCOCTGOGG 
CGGTOAGAAT 
GCTGGCTGGC 

ccrcrAC3U3C 

CCTGTGCTAC 
GCTGATAGGC 
OGT6CTGGCT 
TTTAACTCTT 
OCrCGXCCTQ 
OCTCACBGAS 
AGAAIGGGAA0 
QGCAGCGACT 
GGTGAAGAAC 
CATOGOCTAC 
CAATAACAGC 
AOTDGCOGTG 
GCCTOGGGTT 
CAATGATAGG 
GATGGCCTTC 
GGCTTACTOG 
OCIGCTATAC 
GGCAAGCAGC 
AACCATACTC 



31 
I 

TTCGATTCTG 
GTCCTCTQCar 
GGGTGCAAA6 



GFTOGGCSUSCA 
GCAGGOCXTTG 
CTGiaCTATG 
TAT6TCAO06 
ATCATCSGGTA 
AGAOCCATGG 
<SRAAAC3CCX33 
OGT6TGAAA6 
GGCTTCATAA 
GAGGAXTTTQ 
CC06GTGTT6 
TGCTTCTATQ 
CCACfiGAAGG 
TXllGGGGTGT 



ATCTATQCCA 
ACCAAAACAC 
CrCTTTGACC 
TTGGnSGCTG 
CAGAXGGCXA 
AATGATTCOC 
TCACOCAAAA 



41 
I 

AAACCTTOCT 
GA6AAGGTGG 
TCJCTGCTCAA 
AGA060GGCT 
CACTGGGTGC 
CCATTGTCAT 
GOGAG1T7GG 
TTGGAGAGCT 
CTTCAAGC3GT 

QGGno^crc 

ACATA7TCX3G 
AQTOGGCCAT 
TOGTSTCAGG 
GGAACACATC 
GVGGJOTCAT 
CCTTCGTGGG 
OCATCCCOQT 
€3GQCTG0CCT 
AOGOCTTTAA 
GOGCTCTTTC 
T6GCTGAGGA 
CAATAATOGC 
TOAASGACTT 
CClGTUlViT 
OTACTICCOA 
AGCXBGGGTT 
ACATGGAGGC: 



gogccx:gccg 

C3LCTCGCTOG 
GACAGCCTGG 
CCCTACGACX3 
COGCAGGACA 
TACTACCCGC 
CAGAGCCCCA 
GTAGACAGCG 
TGGGGACTGr 
CJGGGTCCAGC 
GTCCCTGATA 
GGQTGTCGGG 
CGCTGCTCCT 
GGCOGGCAOG 
CCGTGTCXKSG 
AGCTACTCTA 
OGTCCAGGGG 
GCTQAGCCCA 
GGGCX3GITTC 
CXATCTCIGT 



5X 
1 

ACCXXJFGCGC 
CATAGCACTA 
GGATAAATAC 
TGACX9QAGAG 
OGAAGGGCCT 
CTOGGCCy^ 
GACftGCCTGT 
CCC!AGAC3CAG 
GAACTCAAGT 
GAAGASCCTG 
TTGGCAGGTC 
CCnCIGGGTC 
GGTGOGGGCA 
Q^TCATTGAA 
GTTOCCACrC 
GCICACTCCA 
GAASATOTCT 
TQCAGAOGAT 
AOGGGGTOTQ 
GTaOCTTGTG 
AGXAXACACC 
GTAA 



51 
I 

TGTATCCCTC 
TGAGGCTTCC 
CATTGGGCaua 
GTCTCGCTGC 
TGBTGTCTAC 
CTCCTTCCTG 
TGCXOGGGTC 
CTGGGCCrTC 
AGOQAGGGCC 
AOGGACACAC 
AGTGATCATA 
GGTCAACAAA 
ATTTGTGAAA 
AGGCS3GTCTC 
GCCCTTOGGG 
CTTTOACTGC 
GGGGATOGTG 
QV2GCTCATG 
GCAGGTGGGC 
OSCCASTCTT 
TGGACTGCTA 
CACATTAGQC 
GGTGGACCrC 
GGTCZIAOGG 
CGAGTTK3AT 
TTTACCAGAO 
TTCCAAAATC 



660 

720 

780 

840

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560

1620 

1680 

1740 

1800

1807 



60

120 

180

240 

300 

360 

420 

480

540 

600 

660 

720 

780

840 

900

960 
1020 
1080 
1140 

1200

1260 
1314 



60 

120 

180 

240 

300 

360 

420 

480 

540

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200
1260 
1320 

1380
1440 
1500 
1560 
1620 




ATTSTGACCG 
CTCGCAGGGT 

TTCGTGAAtB 
TGGATGCTGA 
TCCCTGGA.11S 
ACAOOCXXIGC 
CACCtXACCC 



TTGTGAACAT 
■XGCTTGGJ^AG 
CTGCCCTOCT 
AGCTCTCATT 
TCTATCTCRT 

taqgcttcat 
cx;gaccaagc 

CCCCCQGW3G 
TOOCCAGCAG 



TTCAACCAGC 
GG2VGGCTCTC 
CTQTGCCGTG 
TAAGGTTCCC 
QATGCA6CTG 
CATCTACTTT 
AAGGACTCXT 
•TOGCAGCACSC 
TGCAAlCAGAA 



CTTATAGCTG 
ACCAAAOGGG 
GTCAOOQ6CX3 
TTCCTOCCAG 
GACCAGGGCA 
GOCrrATGGCC 
GACGGCAACT 
CC0GAGG6AC 
ACCACCTGCa 



Seq ID NO: C35 DNA Sequence

Nucleic Acid Accession #: NM_002776-a

Coding sequence: 83.. 52 2



1 
I 

ACCAGOGGCA 
GGQGACXCCC 
GCCCX5(JGCrC 
GCGCTOCTCC 
CGCGGCTCGC 
GTGCTOGTGa 
GCTOGAGTAG 
CGCrCTGTTQ 
GATGAGCACX; 

TfSGGGCACCA 
ACIATCCTGA 
ATATGTSCTG 
6TCT6TGACG 
CAGCATCCAQ 
CGCTOCAACT 
CAOATGCCCA 
TGTCTGCACT 
CATTCCCCCA 
CARAGC3TTTA 
CTGQQQTCAC 
ASTOOCCTCT 
TCTTAGAGAT 
OTCAT6XAAG 
AAAAAAAAAA 



11 
I 

GACCACAfiGC 
AOATCCTGGC 
TGGCGAAGCr 
CCCAAAAOGA 
AGOCCTGGCA 
ACCAGAGTTG 
OG6ATGATCA 
TCCATCX;CAA 
ATCTCftTGTT 
AGCTTCCCTA 
C3GGCOGCCOS 
GCXX^'AAAGA. 
GACTGGACOG 
AGACCCTCCA 
CTGTCTACAC 
GATGCAGATG 
ISnOGCTCCAT 
GITCAAACCr 
CCTATCOCCA 
TTCCAGAGAA 
CCAACCTGaVC 

ctgaacctca 

QTTOTCAGGA 
GCTTAACACA 
AAAA 



21 
] 

AGGGCAQAGQ 
CATGAGAGCr 
GCTGCCXSCrtS 
CAOGOGCTTS 
GGTCTDacrC 
GGTGCTGAiGG 
CCTQCXGCTT 
GTACCACCRG 
GCTAAAGCTG 
CCQCTGTtaCT 
GAGAiGTGAAG 
GTGTGUUQGTC 
GGGCCAG6AC 
AGG CATCCTC 
CCftGATCTGC 
CTACXSCTCCa 
OGTCGATCGT 
CTQCOGOCCT 
TTCTCTC3C3Cr 
GCCAGGRAGC 
TTCCTCTGOC 
GTITCCTCAT 
GACTATGATA. 
aXGGGTQQTQ 



31 
I 

CACOTCTGGS 
COGCACCrOC 
CTGATGQCXSC 
GAC0CXX5AAG 
TTCAACGGCC 
GCCGCGCACr 

cTTcauaoGca 

G6CTCAGGCC 
GCCAISGGCCQ 
CAGCCCGGAG 
TAX3yU3UU»3 

OCTTGCCAGA 

TarroaaaTG 

AAATACATGT 
GCTGATOCAO 
CTTCCTCCCC 
CCACACCTCT 
GTACTGAAGC 
OQGTC3WCAC 
ACTOCCCGCT 



TTCTCATCAT 
C9CTGTQGGC 
TCATCXGGAG 
TGCTCCOCAT 
CCXGQGTCCO 
TGTGGCACAG 
TGGACCAGTG 

TCCACACCCr 



41 
I 

TCCGCTCCCT 
AOCICTCaGC 



CCTATGGCGC 
TCTOQTTCXyV 
GCGGAAACAA 



CCATCCFGOC 
TAGTOCCSOOG 
AOCAGTGCCA 
GCCT6ACCTG 

GTGACICTG6 
TTTACCCCTG 
CCTGGATCAA 
ATGTTAT6CT 
A6TOQGCIGA 
AAACATCTCX; 
TGAAATGCAG 
CCAOCCTCTG 
GTGTGRCXTT 



TAACKTGTCT 
AGTTCTQACT 



ATGTAAATCT 
AAAGGTTACC 



CACCTTCTGC 
AGICITTCT6 
GCAGCCCQAO 
CCTGAGCATC 
GTTTGCTGTS 
CGAGGAC3GCG 
CAAGT6ACSC 
GACOOGGAGG 
CACTQCA 



51 
I 

CCTTCCTATC 
GGGCTCIGGC 
CGC&QUSQCQ 
C5CCGXQCGCG 
CTGOGC3GGGX 
GCCACTGTGG 
CSQBQACGACr 
AAG6CGAA0G 
GDCCCnOOTC 
GGTTGCTGGG 
CTCCAGCSkTC 
CftAC3UUAT6 
AGGCCOCCTG 
TtJCaCTCTGCC 
TAAAGTCATA 
CCTGCTGATC 

CCTCTCACCT 
GAAGIGGTGG 
AGAGCAGTTA 
GGGCftAGCCA 
OOTGCCTAOC 
TCRTGTGATT 
T0TTGT0GT6 



Seq ID NO: C36 DNA Sequence
Nucleic Acid Accession #: 2H_0950&fl
Coding sequence: 1..4074



a. 
I 

ATGACGOOOG 
ACGGOQGGCC 
GACIXSGCAaC 
GAQCIGCCAA 
GGTGTACeCr 
TC3C3QCTGaAA 
GGATTCTlSGG 
AiCX^AGCGGOG 
GTGCTGCTAC 
CAACAAGTGC 
GGOC3TQGGGA 
IQCSCTGGGGG 

QOGCxx:ccTr 

CTCATGCTGA 
ATCACQCaSC 
AGCCAIQCAOG 



CIGCGGCTGT 
GCXSGTGCTOC 
CAOCAA2UBGC 
ACCXGCOOQA 
GGAGGGCCCT 
GTGGCACCTG 
GOAGACATCT 

ArOTTAlSAGA 
ATCGCCTOTC 
AAACACCCAQ 
GATTCACTOC 
TTAAAAAAAT 
CTTAACCAAT 
GTCTTCAGTA 
TTGAAATGTA 
ATAATTCATA 
TCCTCAAGCX3 
GAATGTQGCA 



11 
I 

COGOraCAGC 
TGCCGCQCX3C 
GCGGCTGOGT 
GOGGGGIGCT 
CGOQC0QM3G 
0GGAGAGC9C& 
GATCCCCOGC 
TGGACTGOGG 
AaOGTTTGOA 
AGGAGOGCCA 
(90C3CTGQCCC 
AG^TOCTOGC 
GCTCltSCKOT 
AGGAQCAGAA 
TGGAGCZ^GAA 

ATGGaQGcx:r 

T6CT6ACAT6 
GAGCQGACAO 
IC^GGGTCCC 
C&GC2U3AGQ3 

GTCCCCAGGA 
CAGIKSGCCAA 
QTCCCCT*CTG 
GGGAfilGSCA 
ACTftC3U3AAA 
TGGAGCAAAA 
TTACATGTTC 
AAAAAGTAAT 
OIIGTAAAAG 
GTTTGTCAAA 
AATTTTCAAA 
AAAAATATGG 
CTAAGGAGAA 
GTACTACACA. 
AAGCCTTTAG 



21 
I 

OGAGOCXaGGC 
CITOCTGCRG 
GCACCTGCGA 
QGAGOaCCTG 



GGGCGOAGSC 
GOACCCCTGT 
CCEGCTGAAG 
GATGAXGG08 

ACTGGGAOGG 
TGAGGCGTGT 
GAlCXSTCCACC 
OCGACICCTC 
GZCOGOQCXC 

GGCTGGGGCr 
TGGGGGAGCC 
ATCTCCAGGG 
CgOST GGCAO 
ClIltSI^GICC 
GGGGGATGGC 
AQTQACACXX: 
TGCCAAGGGA 



31 
I 

AGOCTGOGCA 
GAGATCCAGr 
AQCCAaOGOC 
C0CGGG0GG6 
AdtmOCAOCG 
GOCOGAGGG6 
CAGATGAAQG 
CAGGGG060S 
GSCCASAlQCA 
CTACTGCCCA 
GCCCSGTCOGQ 

AOCCAGGAGG 



CCTCTTCTCC 
TAAAGAGCCC 
TCATTTGAAC 
AOCAAQAACA 
AGTAGATGAG 
TACCOUUIAC 
TTCCAATAGA 
CAAATCATTT 
GTCCTACAAA 
TAAAAGRATT 
GTGGCXCTCA 



OTOCCCCACA 
CTCTTGAGTC 
CTOCAOGAGT 
AAAAGAACTO 
CTOQGA7GOG 

AGCTGGAGCG 
AAOC(?7GAGG 
CTGTTGACAT 
C3VTf3CTCAGC 
CTGGGXA3GA 
CAOAATATAA 
CAAGACCTTC 
TATGQAAAAT 
TGTGAG6IGC 
AAAATATTTC 
CACAATGCAA 
TGCATGTTTT 
TGTGA2^GAAT 
CITACXGGAfi 
AAGCTTACTA 



41 

I 

OOGOCTCGOC 
OCXTrGTTAGA 
CCCTCIGGGT 
GCGGGOGOCA 
GCCGAGGGGT 
GCCTQCAGCS 
AACACOGGAG 
AOCTOanGCA 
ATTGGTACCA. 
GAGCCSUSGOC 
AGGTACAGQA 

CCTCCCAGCA 
TGACCGAGAA 
TATT3X?lkGaC 
TTGAGGCCIT 
CX3CATAGTCC 
TACCAGATAC 
CTCACOCaO 
G0GCTGAG8C 
OCTCTGAISCC 
ACCTQGGCGC 

TCAGAGACAT 
A6AATTTATA 
CTGTCTCIAA 
AGAGAAATGA 
AGCCAGA6CA 
GTGOACATQA 
ACAAASGAGG 
AGACTCKIAA 
GATATACTGG 
CACACCTAAA 
GTGGCAAATC 
AGAAACGCTA 
GACIVTAAaAG 



51 
I 

OGCQOGCRGC 
CATCCTGCaAC 
GGAAGOQCGS 
GC0C3GGAGCA 
TOCAGAGOQC 
GCTTGSftGGC 
GCACAOCATC 
OQAQAAGOAG 

GcnocnecrG 



GSTGGCCGS5 
ATCCTCCTCC 
GCAG2VrCATC 
GAjQTGAGOGC 



GAOOGGTIT 
GCAOCTA^TA 
CTGGTTTCCA 
GCTGCftCTTC 
TOGGOaCGAG 
TQACTOCCTC 
AGG6TTTGAT 
GAGTOGGCAC 
AGCTATAGAA 
TAGAGATGTG 
GCCAGACmG 
GATOGCAGCC 
GAOCATAAAA 
GAATTTACAA 
TTAXAA3KSAC 
ATGTGTCAAA 
AAASAAACAT 
TCAACSmZAG 
CTTTAACCAC 
aUUTQTGAG 
AATTCACACT 



1680 
1740 

1800

1860 
1920 
1980
2040 
2100 
2157 



60 

120 

180

240 

300 

360 

420 

480 

540 

600
240
GOAQAGAAAC CCTAOCSCATQ TGAAGRATCT GGCCRAQCCT TTAGGCGCTC CTCAAOlCrr 2220

ACTAACCACA AOAGAATTCA TACTGQAQAG MACCCTACA AATGTSAAGA ATCTOGCAAA 2280 

GCCTTTAGCa TATCCTCAQC CCTCATTTAC CACAAGAQAA TTCATACTGG AGAGAAAOCC 2340 

- TACACATGTG AAGAATQTGG CAAAGOCTTT AACTGCTCCT CGACTCTTAA GACACATAAG 2400 

J ATAATTCATA CTGGAGAGAA ACX:crACACA TCTGAAOAAT GTOGCAGAAC CTTTAACTQC 2460 

TOCTCAACTG TAAAGGCACA rAAGAOAATT CATACTGGAfi AJSAAACCftTA CAAATISTGAA 2520 

GAATGTGACA AAGCTTTTAR QTGGCATTCA AGTCTTGCTA AACATAAGAT AATTCAC3VCT 2580 

GGAQAGAAAC CCTACAAATG CAQTGACAGC AAAGCCTTAG CCAAATCATC AOAAGTGCAA 2640 

- ^ AAGGTCrACT CTOGAGATGG GGAAAATGOA ATCCGTGTAC ATAAGAAAAA GGAfiACACAG 2700 
L\J GGCTGGCTTG TGAGAAACAA GAACX5AAAAT AGAACAGGGC TGTTCCftSAT OGGGGCIGCC 2760 

GTGAGACCTA ACAGQGACCC TTCATGGGGA CftGCAAG&Afi GTTCACTGftC TGACCCAATT 2820 

CAQACGAAGQ AGGAAOCTGA CCTTCAAAAT CRCTATOACC ATCRGAATGC CTTAGAAGAT 2680 

CAAAGAAATA CTOGAGTGGG TGGACTGTTG ACATTCAGAG ATGTAGTCAT AGAATTCTCT 2940 

^ CTOGAGGftOT GGCAATGCCr GGATCACGCT CAGCAGAATT TAlATAGAiGA TX3TGATGTTA 3000 

SS^iJi?''^*^ GAAAOCIGGT CTCCCTGGGT ATTBCTGTCr CTAAGCCAGA CTTGATCACC 30€0 

TGTCTGGAGC AAAATAAAGA GCCTTGGAAT ATAAAGftQAA ATCAQATQST AACCAAACAC 3120 

CCAGACCTTC CGCCAGAGCT AGGCATAAAA GATTCACTCC AAAAAJSTAAT ACCAAOAAGA 3180 

TATGGAAAAA GTGGACATGA CAATTTACAA OTAAAAACAT GTAAAAGCAT QQaTGAGTGT 3240 

OAGGTGCSiAA AAOGAGGTTa TAATGAAOTT AACCAATGTT TOTCAACTAC CCAAAACAAA 3300 

jtU ATATTTCAGA CTCATAAATG TOTCAAAGTC TTOOGCAAAT TTTCaAATTC CAATAGACAT 33C0 

AASAiaAGAC ATACTGGAAA GAAACATTTC AftATOTIVAAA AATATGQCAA ATCATTTTGC 3420 

ATFGGTTTCAC AACTACATCA ACATCAGATA ATTCATACTA GGGAGAATTC CTAOCAATOT 3480 

GAAGAATGOS GCAAACCCTT CAAClGCTCT TCAACCCTTT: CTAAACATAA AAGAATTCAT 3540 

- _ ACTGC3AGAGA AACCCIACAG AT3TGAGGAA T6TGGCAAAQ CTTTIACXZTG GTCCTCAACC 3600 
^-D CTTACraAAC AXAGGAGAAT TCATACTOOA GAAAAACCCT ACACATGT6A AGAATQTGGC 3660 

CAMCCTTTA GCOGCXCCTC A2iCACTTGCT AAOCACAAOA GAATTCAIAC TGOAGAiSAAA 3720 

CCATACACAT GTGAA*3AATG TOGCAAAGCC TTTAfiCTTAT CCTCATCCCT CACTTACCAC 37B0 

AAGAQAATTC ATACTQGAOA GAAACCCTAC ACATGTGAftCS AATGTGaCAA AGCCTTTRAC 3340 

TGCTCCKJAA CCCTTAAGAA ACATAAGATA ATTCATACTQ GAGAGAAACC CTACAAATOT 3900 

AAAGAATCTG GOAAAGCCTT TGCCTTCTOC TCAACTCTTA ATACTCAaAA GAGGATTCAT 3960 

ACTCC5AGAGG AACCCTACAA ATOTGAAGAA TGTGACAAAG CTrTTAAGTG QTCCTOMST 4020 

CTTGCTAATC ATAAGAGTAT GCATACrrOGA GAGAAACSCCT ACSlAATGrOA ATAA 4074 

Seq ID NO: C37 DNA Sequence
Nucleic Acid Accession #: NM_032044
Coding sequence: 182..658

1 11 21 31 41 51

40 AAGATATAAA AGCTCCAiSAA AOGTTQACTG GGACCACTGG AOACACTGAA GAAG0CAGGG 60 

GCCCTTAGAO TCTTOSTTGC CSUkACAGATT TGCAGATCAA QGBQAACOCA QGAGTTTCAA 120 

AGAAGCOCTA GTAAGGTCTC TOAGATCCTT GCACTASCTA CAT0ClCP«3G CTAGOAGGAA 180 

GATGGCTTCC AGAAQCATGC QOCTGCTCCT .ATTQCraAGC TGCCTGGCCA AAACAGGAQT 240 

. CCTGQOTGAT ATCATCAT3A GACCCAGCTG TGCTCCTQGA TGGTTTTACC ACAAQTCCAA 300 

TTGCTATGQT TACTTCAQGA AGCTGAjGOAA CIGQTCTGAT QCOGAGCrOS AGTGlCaGTC 360 

TTACGGAAAC QGAGOCXaCX: TOGCATCTAIT CCTGAGTTTA AAQGAAflCCA GCacCATAGC 420 

AQAGTACATA AGTOGCTAIC AGftGRAGCC^ GOCGATATGG ATTGGOCIGC AOGACOCACA 4flO 

GAflfiAQGCRG CAGTGGCAQT GGATTGATQ6 OGCXXIG'TAT CTGTACftSAT CCTOGTCrGG 540 

SO ^i^^^ GGTGQGAACS^ AGCACTGTGC TGAGATOAGC TCCAATAACA ACTTTTTAAC 600 

-JU TTGGAGCAGC AACGAATGCA ACAAGC3GCCA ACACTTCCM TCCftADTACC G&CX»TAGfifi 660 

CSiAQAATCAA GATTCTOCXA ACTCCTGCaC JM3CCCG(3TCC TCTTCCTTTC TGCiaGCCTO 720 

GCTAAATC7X? CTCATTAlTT CRGRGQI3GAA ACCTAGCAAA CTaWSMPTOA TAAGQGCCCT 780 

ACTACACTGG CTTTTTTAGG CTTAGAGACA GAAACTTTAG CAlTt3GCXXA GTAGTGGCrr 840 

CTAGCTCTAA A1GTTTGCC3C CGCCAl-CGCT TTGCACAGTA TCCTTCPTCX: CTGCTCCCCT 900 

GTCTCTGGCr GTCTC»AGCA GTCTAGAAGA QTGCATCTOC AGOCTATGAA ACAGCTSGGT 960 

^^^^l ^^a^^ g«gTraABfl A CAGAAGQAAG AAACTCftQSA GTAAGCITCT 1020 

108O 
X140 



65 



AGACCCCTOIC AGCTTCiaCA OCCTTCTGCC CTCICKXar TGCCTGCROC CCACCCCAGC 
CIGCTTGrrr TTCCTTTOGC CATAOOAAGG TTTACCafiTA GAATCCTTGC 



TRGerTGATC TGGGCCATAC ATTCXTPTTAA TAAAOCftTTB TGTACATaAG AAAAAAAAAR 1200 



Seq ID NO: C38 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 52..3042



1 11 21 31 41 51

GCTCACCC^ GAAAAATAT6 CAAJCGTOCC ATTQATATAC MGCCACTAC AATGfiATGGA 60 

OT^AACCTCA GCACCGAOOT TGTCTACAAA AAAGGOCAGQ ATTATaCGTT TOCTTCCCAC 120 

„ GACOGGQOCSV 6AGCCTGCCG GAGCTAjCCQT GTAaSQTTCC TCTeTGGOAA GCCTGTQAGG 180 

/U COCAAACrcav CAGTCACCAT TGACACCAAT CTGAACAjaCA CCATTCTGAA CTIGGAGOAT 240 

AATGTACRGT CATGGAAAC3C TGORGATACC C3TOTTCR1TG CCRGTACTCA TTftCaXKATG 300 

TACCAGGCRG AAGAGTTOCA GGTQCTTCCC TGCAfiATCSCT GCDOCCCCAA CCAGGTCAAA 360 

GTQGCftSSGA AACXSU^TOTA CCTGCACATC GGOCSAGGAGA TAGACGQCGT QGACATGOQG 420 

7S GGCXTCTGAG CQQOAACATC ATAGW5ATGG GGOAGATGGA OCJACAAATOC 48Q 

^^SSSS?^ GAAACCACAT CTGCAATTTC TTTGACTtCXS ATACCTTTGG QGGCCACATC S40 

MGTTTGCTC TOGSAITTAA GGCAQCACAC TTGGAQGGCA CQQAGCTOAA GCftlA!K3QI3A 600 

^QC^CTGG rrGGGTCAOTA CC3CGATTCAC TTOCAOCIGG COQGTOATCT AGAOOAAAfiG 660 

OGJGSPTATQ ACCCAOCCAC ATACATCAGG OACCTCTCCA TCCATCATAC ATTCTCTCOC 720 

8ft TCCATGGCrC CAATGGCTTG TMUTCAAGG ACGTTGTGGQ CTATAACTCT 7B0 

GCTTCITCAC'GGAAGATGGG CCGGAGOAAC GCAACaCTTT TGACCACTGT 840 

^S^^^ TTQTCAAGTC TGOAACCCrrc CTCCCCTOGG AC3GGTCACftG C2RAGATOTGC 900 

AA^TGMCA CAGGAGACrc CTACCCAGGG XACMCCC3CA AGCXJCftOGCR Ai3AC3GCaAT 96D 

GCTGTGTCCA CCTTCTOGAT GGCCAATOC3C AACftACAAOC TCATCAACIG TGCX3GCTGCA 1020 

OGATCTGAGG AAACTOSMT TTtSGrTTATT TTTCACCaOQ TACCAAOGOG COGCTOOSIQ 1000 
GGAAIGTACT CCCCAGGTTA TTCAGAQCAC ATTCCACTGO OaAAATTCTA TAACAACCX3A 1140

GCACATTCCIV ACTACCGGt3C TGGCA.TGATC ATASACAACG GAGTCAAAAC CACCGftiSeCC 1200 

TCTGCCAAGG ACAAOCGGCC GTTCCTCTCA ATCATCTCTG CCAGATACAG CCCTCACCRG 1260 

GACX5(X<5ACC CGCTGAAGCC CCXSGOA^SCCG GCCATCATCA GACACTTCAT TGCCTAOAG 1320 

AACCAQGACC ACGGGGCCTG GCTGCX3CGGC GGQOATGTGT «5CTGGACAG CTQCXI3GTTT 1380 

GClGACSUUrCS GCATltaGCCT GACCXTTOGCC AGTGGTGGAA CCTTCCCGTA TGAOGACGGC 1440 

TOCAAGCAA6 ASATAAAGAA QVGCTTGTTT GTTGGOGAGA GTGGCAACCT GGaOACQGAA 1500 

ATOATGGACA ATAQGATCTG GGGCOCTGQC QGCTTGQACC ATAGCX3GAAG GACCCTCCCT 15 6 D 

ATftSOOCAGA ATTTTCCAAT TAGAGGAATT CAGTTATATG ATGGCCCCAT CAACATGCAA 1620 

AACTGCACTT TCCGAAAGTT TGTGGOCCTQ GAGGGOCGGC ACACCAGCGC OCIGGCCTTC 1680 

CX30CTGAATA ATGCCTGGCA OAGCTGCCCC CATAACAACX3 TGACOGGCAT TGCCTTTGAG 17 4 O 

GAOSTTCCGA TIACTTCCAG AGTGTTCTTC GGAGRGCCIG GGCCCTS6TX CAACCAGCTO 1800 

GACATGQATO GGGATAAGAC ATCTOTQTTC CATOAOaTCG A0GGCTCC6T GTGGGAGIAC 1860 

CCTGGCrCCT ACCTCACGAA GAATGACiiAC TGGCTGGTCC 6GCACCCAGA CTGCATCAAT 1920 

GTTCCOQRCT GGAGAGGGGC CATTTaOiGT GGOTGCrTATG CACAGATGTA CATTCAAGCC 1980 

TACAAGAOCA GTAACCTGCG AATGAAGATC ATCAAGAATG ACTTCCCCRG CCACCCTCTT 2040 

TACCTQGAQQ GGGCGCTCAC GAGGAGCACC CATTACCAGC AATACCAACC GGTTGTCACC 2100 

CTOCAGAAGG GCTACACCAT CCACTOGGAC CAGAOSGGCC CCGCCGAACI CGC3CATCTGG 2160 

CTCATC3UICT TCAACAASGG CGACTGGATC GBAGTOGGGC TCTGCTACGC GOGAOGCACC 2220 

2\J ACATTCTCCA TCCTCTCGC3A TGTTCACAAT aSOCTGCTGA AQCAAAOGTC CAAGACGGGC 2280 

GTCTTOTTGA GGACCTTGCA GATGQACAAA GTGGAGCAGA GCTAOOCTGG C3y3GftGCCAC 2340 

TACTTiCTQGG AGGAGGACTC AGGGCTGTTG TTCGtGAAGC TGAAAGCTCA aAACOAOAGA 2400 

GAGAAGTTTG CTXTCTGCXC CATQAAAGGC TGTOAGAGGA TAA&GATTAA AOCTCIGATr 2460 

CCAAAGAACG CAGGCOTCAG TGACTGCACA GCCACAGCTT ACCCCAAGTT CACGGAOIUSG 2520 

GCTSTCGTAQ AOGTGCOGAT GCOCAAGAAG CTCTTTQOTT CTCAGCTGAA AACAAAGGAC 2560 

CATTTCTTGG AGGXGAAGAT GGA(3AJ?rTCC AAGCRGCACT TCTTCCAOCT CTGGAACmc 2640 

TTCGCtTACA TTOAAGTGGA TGGGAAGAA6 !tAiOCCCAGTT CSOAQOATGa CATCCAGGTG 2700 

QTGGTGATTG AOGGGAACCA AGGGCGCaTG OTGAQC^CA. GGAGCTrcaG CSAACTGCAIT 2760 

CTGCaAGGCA TACCATBGCA GCTTTTCAAC TAT6IGGCGA OCATCCCTGA CAATTCCATA 2820 

GTGCTTATGG CATCAAAGOG AA3ATACGTC TGCAQAGGCC CATGGACGAG AGTGGTGGAA 2880 

AAGCTTGGGG CAGACAQGGG TCTCAAGTTG AAAGftGCAAA TtSSCAITCGT TGOCTTCAAA 2940 

GGOISCTTCC GGCCCATCTG GGTGACSurrTG GACACTGAGG ATCACAAAGC CAAAATCXTC 30OO 

CAAGrrGTGC CXATCCCTGT GGTGAAGAAG AAGAAGTTGT GAGGACAaCT GCG3CC066T 3060 

GCCACCTCGT GGTAGACTAT GAOSOTa&CT CTTGOCAGCA GACCAGTOGG GGATGGCfGG 3120 

GTCCXrCAGC CCCTQCCAGC AGCTGCXTPGG GAAGGC0GT6 TTTCAGOCCT GATGGGCCAA 3180 

GGGAAGGCTA TCAGAGAOOC TGGTGCTQCC ACCTGCCCCT ACTCAAGTGT CTAOCTGSAG 3240 

CCOCTGGGGC GGTQCTOGCC AATGCTGGAA ACATTCACXT TCCTGCAGCC TCTTGGGTGC 3300 

TTCTCTCXTTA TCTGTGCCTC rm^ACSTGaOG GTTTt^GGGAC CATATCAGGA GACCTGGGXT 3360 

^ GTGCTGAiCAG CAAAGATCCA CTTTOGCAGG AGGOCrGACC C»BClAiSGJia QVmaTCTGGA 3420 

*IU GGGCTGGTCA TTCACAGA'XC CCCATQOTCT TCAGCAGACA A5TGAGGGTG GTAAATGTAG 3480 

GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACTCCTGTA AQCAAOAQCC AACX:TCACAG 3540 

GATZAGQAGC TGGGGTAGAA CTCaGCTATCX: TTGGGGAAGA GGCAAGCCCT GOCTCTGGCG 3600 

GTGICCACCT TTCAGGAGAC TTTGAGTOGC JUSGTTIGGAC TTGSACXAGA TGACTCTCftA 3660 

AGQCX:CTTTT AGTTCXGAGA TTCXB^QAAAT CTGCTGCATT TCACATGSTA CCTGGAAOCC 3720 

4-^ AACaUSTTCAT GGATATOCAC TGATATCCAT GATGCIGGGT GCCCCAGOGC ACACGOGATG 3780 

GAGAGGTGAG AACIAATGCC TAGCTTGAGG GGTCTGCAGT CCAGTAGGGC AQGCAGTCAG 3840 

GTGCA!C6rGC ACXGCAATGC CAGG1X3GAGA AATCACSUSAG AGGTAAAATG GAOGOCAGTG 3900 

CXarTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATGAAGG ClGGGkSGCaT 3960 

3TTGCTQQQG GOAGATGAGG CAGCCTCXGG AATGGCTCA6 QQATTCAGCC CTCCCTGCOG 4020 

CTGCCTGCTC AftGCrGGTGA GTACGQGGTC GCCCTTTGCT GAOSTCTCrC TGCOCCACTC 4080 

ATGATGGftGA AGTGTGOTCA GAOGGGAGCA ATGGGCTXTG CTQCTTATQA GCACAGAGGA 4140 

ATTCAGTCCC;; CAGGCAGCCC TQCCTCTQAC TCCAAGAGGG TGAAGTCCAC AGAAGTGASC 4200 

rrcXTGOCTTA GGGCCTCATT TGCTCTTCAT CCAGGGAACT GAOCACSUQaG GGQCTOCAGG 4260 

AGAO^CTRflA TGTGCTCSTA CTCOCTCCSQC CTGQGATTTC AGftGCTGGRA ATATAGAA3\A 4320 

TATCXAGOC3C AAAGCCTTCA TTTTAACAGA TGOOSAAAGT 6AGGCCCCAA GATGGGAAAQ 4380 

AACCAGACAiG CTARGGGAGG GOCTGGGflAO CCCCRCCCTA GCCCTTGCTO CCACACCAC3^ 4440 

TTGCCTCRAC AACEGOCCOC AGAGTGCCCA GGCRCTCCTG AGQTAQCrTC TGGAAATCGG 4500 

GAGAASTCC3C CTOGAAGGAA AGGAAATGAC TAOAGTAOAA TGACAGCTAG CAGATCTCTT 4560 

QCCTO CTGC T OOCAGCGCAC ACAAAOCOGC CCTCOOCTTG GTGTTGGCGG TCCCrrGTGQC 4630 

CrrCACTTTG TTCACTACCr GTCAGCOCRO CCTGGGTGOt CASTAGCTGC AACTOCCCAT 4680 

TGaTGCTACC TGGCTCTCCT GTCTCTGCSiG CTCTACAGGT GfiGGCCCAGC AGAGGGRGTA 4740 

GGGCTOSCCA TeTTTCTGGT GftGCCAATTT GGCTGATCTT QaSTGTCTGA ACAGCTATTG 4800 

GGICCACCCC AGTCCCTTTC AGCTQCTGCT TAATQCXX7TG CTCTCTCCCT QGCXICACCrr 4860 

ATAGAGAGCC CRAAG2W3CTC CTCTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATPCTTGC 4920 

0*> AOGAGGCACC AQAQTCrOOC TGGGTCITQT GATGAACTAC ATTTATCCOC TTTCCTGCCC 4980 

CRACCACAAA CTCTTTCCTr CAAAGAQGGC CTGCCTOGCT CCXTTCCACCC AACTQCACCC 5040 

ATGAGACTGG GTCC;!AAGAGT CCATICCCCA GGlxaQaAaCC AACTGTCa«5G GftGGTCTTTC 5100 

CCAC^AACA TCTTTCRGCT GCfTGQGAGGT GACCATAGGG CXCTQCTTTT AAAQATATGG 5160 

CTGCTTCAAA OQCCAGAOTC ACAQGAAOOA CTTCTTCCSiG GGAGATTAGT GGTGATGGftG 5220 

AGGAGACTTA AAATGACCTC ATGTCCTTCT TGTCCACGQT TTraTPGAGr TTTCACTCTT 5280 

CTAMQCAag GGTCTCACnC TGTOAACCAC TTAGGATGTG ATCACXTTCA GGaBGCdAQG 5340 

AATGTTGAAT GTCTTIGGCT CAGTTCATTT AAAAAAGATA TCTATTTGAA AGTTCTCRGA 5400 

GTTGXACATA TGTTTCACftG TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460 

ACCAAGAGCC AATATCTAGG CATTTTC3TG GTAGCACAAA TTTTCTTATT GCTTAGAAAA 5520 

TTGrCCTQgr TGTTAT TTCT GTTTOTAAGA CTTAAGTGAG TTRGGTCTTT AAGGAAAGCA 5580 

ACGCTOCTCr GAAATGCTTG TCTTTTTTCT GXTGCGGAAA 13U3CTGGTOC TTTTTCGGGA 5640 

BTTAQ ATOra. TRCASTBTTT GTATI?rAAAC ATTTCTTGTA GGC^AXCAGCA TGAACAAAOA 57O0 

TATAITTTCT ATTTATTTAT TATA1GTGCA CTTCAAGAA6 ICACTQTCAG AGAAATAAAG S760 

AATTGTCTTA AKTGTCAAAA AAAAAAAARA AAAAAAAAAA AAAAAAAA 5808 
1 11 21 31 41 51

I ( I I 1 i 

GTOGCXTTCGA GGTGGTGGCA GQGCGQCCCC CTGCAGTCCG GAOACGAAGG CACGQACOSG €0 

GCCrCGGGAG GCAGGTTOOG CTGCAAOGAA COOCTCraaC TTOGTOCTAC ACTTGOSCRR 120 

J ATGTCTCCGA GCTTACTCSlC ATAGCATATT GGTATATCAA AA-TGAAATGC AAGGAACCAA 180 

AAATAACRTA ATTGAAOGCA GrAAAAGTGA AATTAAATAG (3AAGATCATC AGTCAAGGRA 240 

GACCCACTGG AC3AGGACAGA AAATQAAGCA GTGTTrTATC A1K3TQTATTT CftGC3VG6TCT 300 

TCTTGAAATT TAACTAAAAA TATGACtCGCT CTCTCTTCAG AGAACTGCTC TTTXCAOTAC 360 

CA6TTAG9TC AAACAAACXIA GCCCCTAGAC GTTAflCTATC TBCTATTCTT GATCATACTT 420 

lU GC3GAAAATAT TATTAAATAT CCTTACACTA GG7ATGAGAA GAAAAAACRC CTGTCAAAAT 480 

TTTATGGAAT ATTTTTGCAT TTCACTAQCa TTCX3TTCATC TTTTACTTTT GGTAAACATT 540 

TCCATTATAT T6TATTTCAG GGATTTTGIA CTTTTAAGCA ITAGGTTCAC TAAATACCAC 600 

ATCTTOCTAT TTACTCZkAAT TATTTOCTTT ACTTATOGCT TTTTGCMTA TCCAGTrTTC 660 

CTGACAGCTT GTATAGATTA TTGCCICAAI TTCTCTAAAA CAACCAAGCT TTCATTTAAG 720 

TOTCAAAAAT TATTTTATTT CTTTACAQTA ATTTTAATTT GGATTTCAGT CCTTGCTXAT 7Q0 

GTTTT^GRG ACCCAGCCAT CTACCAAAfiC CT6AAGGCAC AGAATQCTTA TTCTCGTCAC B40 

TQTCCTTTCT ATGTCAGCAT TCAGAlQTTAC TGOCTGTCAT TTTTCATGST GAIGArTTTA 000 

TTTGTAGCZT TCATAACCTG TT6GGAAGAA 6TTACTACIT TOfiTACAfaOC TATCAGQATA 960 

ACITCCTATA TGAATGAAAC TATCTTATAT TTTOCTTTTT CATCCCACTC CACTTATACT 1020 

CTGAGATCTTA AAAAAATATT CTTATCCS^ CTCATTOTCT QTTTTCTC3WJ TACCTGGTTA 2090 

CCATTTGTAC TACTTCAGGr AATCATTGTT TTACTTAAAG TTCAGATTCC AGCRTATATT 1X40 

GAGA TGAATA TTCCCTGGTT ATACTTTGTC AATAGTTTTC TGKTTaCTAC AGTCTATTCG 1200 

TTTAATT6TC ACAAGCTTAA TTTAAAAOAC ATTGGATTAC CTTTGGATCC AXlTTGrCAAC 1260 

TG6AAG1GCT GCTTCATTCC ACTTACAATT CXTTAATCTTG AGCAAATTGA AAAGCCTATA 1320 

ZD TCAATAATGA TTTGTTAATA TTATTAATTA AAAGTTACAG CTGTCATAAG ATCATAATTT 1360 

TATGAACAGA AAfiiUkCTCAQ GACATATTAA AAAATAAACT GAACTAAAAC AACTTTTaCC 1440 

GCCTGACIQA TAQCATTTCA GAAIGrGTCT TTTOAAOGOC TATAGCAGTT ATTAAATAGT 1500 

OTTTTATTTT AA AAACA ftAA TAATTCC3UU3 AAGTTTTTAT AGTTATTCAS GGACSRCTATA 1560 

XTACAAATAT TACITTQTTA TTAACACAAA AAGlGAIAAiS AGTTAACA1T TGOCIATACT 1630 

GAT6TTTGTG TTACDCAAAA. AAACTACTGa ATGOUIACIG TTATGTAAAT CTGAGATTTC 1680 

ACrGftOlACT TTAAGATATC AAOCTAAACA TTTTTATTAA AT&ITCAAAT OTAAGCAAGA 1740 

AAAAAAAAA 174 9 
Seq ID NO: C40 DNA Sequence
Nucleic Acid Accession #: BC013089
Coding sequence: 1..2571

1 11 21 31 41 51

1 I I I 1 i 

ATHGCCCTOG TACTCGGCTC CCTOTTGCTG CTGGGGCTGT GCGGGAACTC CTTTTCAaOA 60 

GGGCAGCCTT CATCCACAGA TGCTCCXAA6 GCTTGQAATT ATQAATTQCC TGCAACRAAT 120 

TATGAGACCC AftGACTOCCA XAAAQCXGOA CCXaTTGGCA TTCTCTTSGA ACTAQTGCAT IBO 

ATCTETCTCT ATQTGCTCACA. i3CaSCaXGta mrCOQUSAAS ATACTTTGAG AAAAITCIXA 240 

caSAAGGCAT ATGAATCXSUV AATTSATTAT GACAAGATTG TCTACTAKSA AGCMOGATT 3Q0 

ATTCTATCiCT GTGTCCTGGGr GCTGCTQTTT ATTATTCXGA TGCCTCTGGT GGGGlSmTTC 360 

TTTTGTATGT GICCSTTGCTG TAACAAATGT GGTGGi«3AAA TGCACCAGCO AC3W3AAGGAA 420 

AAI00GC30CT TCCiaAGGA& ATGCTTTBCA ATCTQOCTOX TGGTGATTTG TAIAATAATA 480 

AGCATTGGCA fTCnTCCATGQ TTTTGT8GCA AATCAOCSUSO TAAfiRACCGS SATCAAAAOa 540 

A6IGGGAAAC TGGOUSATAG CAATTTCAAG GACTTGGQAA CTCTCTTGAA T8AAACTGCA 600 

DU GAGCAAATCA AATATATATT GGOC3C3WOTAC AACACTAOCA ASGACRASGC GTTCACM3AT 660 

CTGAACflGTA TCAATTCAGT GCTAGGAGGC ©GAATTCTTG ACOQACTGAG ACOCAACATG 720 

ATCCCrrarrrC TTGAIGAGAT TAAGTCCSVIO GCAACAGOGA TCAAGGAGAC CAAnOAGOCa 780 

TTGGAGAACA XGAACAGCAC CIT6MU3AGC TTGCACXSAAC AAAi9FACACA GCTTAjSCAGG B40 

AtSTCTQACCA GOCSTGAAAAC ^EAOCCTGCOG TCATCTCTCA ATGACGCTCT GI6CTTGGTG 900 

CKTCCATCAA 6TGAAAOCTG CAACAGCATC AGATTCTCTC TAAGCCAGCT GRATAGCRAC 960 

OCTGAACTOA OTCAGCTTCC AOCGGrGGAO! GCAGAACTTG ACAAOGTTAA TAAjOGTTCTT 1020 

AGGACAGATT TGQATGQCCT G6ICCAACAG GGCTATCAAT CXXTTAATGA TATAOCTGAC lOaO 

AGASTACAAC GOCAAACGAG GACXBTanA OCAGGTATCA TUttlSGGTCrT GAAXTOC!BTT 1X40 

GGTTCAGATA TCGACAATGT AACTCAGGGT CIICCTATTC AGQATATACT CTCAGCATTC 1200 

TCTGTITATG TTAATAACAC TGM^A0TTAC ATCC3UZAQAA ATTTACCTAC ATTGGAAGAG 1260 

TA3X3ATTCAT ACTGGTGGCX QOQTGGCCTG GTCATCTGCT CTCTGCXGAC CCTCaTOtTrG 1320 

ATTTTTTACr ACCTGGGCTT ACTGTGMGC GTGTOOaGCr ATOACBUSGCA TGCKAOCCOG 1380 

ACCACCOGAS GClGTGTCrC CAACM3QC3GA O80GTCITCC TCA3G6TTGG AeTTQOATTA 1440 

lusTTTCxnrcr itcgctggat atigatgatc attgtostic ttacctttgt ctttggtgca isoo 

AA^rGTOOAAA AACTGATCTG TGAACCITAC AOGAGCAAGG AATTATTCCB GGTTTTGGAT 15S0 

ACAOOCTACT TACTAAATGA AGACTGGGAA TACTATCICT CTQGaAAQCT ATTTAATAAA 1620 

irCAAAAATGA AGCTCACTTT XXoAACAAGTT TACAGTGACT GCAAAAAAAA TAGAGGCACT 16Q0 

TAOGGCACIC TTCAGCXGCSL GAAGAGCTTG AAXKrCMSm AAiC3LTCTCAA CATTAA1GAG 1740 

CATACIGGAA GGATAAGCRG TQUCTTGGAA AGTCTGAABG TAAATCTTRA •KATCTTTCTO IBOO 

/U TTGGGTGCAG CAGGAAOAAA AAACCTTCAG SATTTTGCTO CTTGTGGAAT AGACAGAATG 1660 

AATXATGACA GCTACTTGGC TCAGACTGGT AAATCCCCOG CftGGAGTGAA TCTTTTATCA 1920 
TTTOCATATG ATCTAISAAGC AAAAGCAAJW: AQTTTOC3C3CC CTGGAAATTT GflGGAACTCC ' 1980 

CTGAAAAGAG ATGCACAAAC TATXAAAAGA AO^CACChGG AAOSAGTCCr TCCIAXABAA 2040 

CAATCACTGA GCACTCTATA OCSlAAGOGTC AAjSATACrTG AAOGCACAGG GAATGGATTG 2100 

TTQGAGAQAG TAACTAQGAT TCTAGCTTCT CTGGATTTTG CXCSVSAACTr C3VTCACAAAC 2160 

AATACTTCCT CTGTTATTAr TG3U3GAAACT AAGAAGTATG GGAGAACAAT AATAGGATAT 2220 

TTTGAACATr ATCTGC3W3TG GATCGAGTTC TCTATCAGXG AfiiUU\GTOGC ATC3GTGCAAA 2280 

OCTSTGGCCA C0GCTCTftC3A TACTGCTGTT QATSlCTtTC TGTGTAGCTA CATTATOGAC 2340 

CSCXZTTOAATT TGTTTTGGTr TGGCATAGGA AAAGCIACTG TATTTTXACT TCZCOGCTCTA 2400 

ATTTTTGCGG TAAAACTGGC TAAGTACTAT OGTCX3AATGG ATTCPGAGGA CGTGXAOGAT 2460 

GATGTTGAAA CTAl^ACCCAT OAAAAATATG GAAAATQOTA ATAATGGTXA TCATAAAGAT 2520 

CATGTATATG GTATTCACAA TOCXGTTATO ACAAGOCCAT CACAACATTG A 2S71 

Seq ID NO: C41 DNA Sequence
Nucleic Acid Accession #: NM_033049
Coding sequence: 28..1566






1 
1 

CCACGCGTCC 
CTCCTTTCTG 
GAAACTBGCA 
GAAACTGCTA 
CCCATAATTA 
ACACRTAGTr 
GTAAATTCAT 

CAATCATCAG 
GGTCCCWtSCA 
CA.TAATACAA 
AAAGGAAAG6 
GAGAAftCATT 
GTATTTOGQV 
CC3iAGATCTG 
GCAGAAACCA 
AQTAC3CTCAA 
AACX3M3ACTG 
AGQCCTAACC 
AACGCACAQC 
<3TOGGGGaCT 
GGACTGGACT 
ATlGTCATTC 
AAGCATATTG 
ACftSGOTTCR 
TCX»GA6ACA 
TATTAfiAATC 
TTAGA8TGIT 
TPmCTTCC 
AATOCAGCTC 
CTTGTCAQAQ 
TTTTCTTAAT 
COCAGCATTB 
G(»GACftTAC 
CAAOGGTCAC 
ATC9QAGASAT 
AGftClGGAAA 

TTTT'rGAAAT 
AGAGASSCAA 
CTTCTTC3CET 
GAA6ACTGGC 
TXTGAAATTQ 

ACAAATGGG6 
AITICIAASfi 
IGTTAATTGT 
AAAAAAA 



11 
1 

GAGCAAGAAC 
TAAACACAGC 
CTAGTGGTCC 
GCACCACAGC 
GTACACATAQ 
CCTCCACAAT 
TAGCTAjCCTC 
CTTCTGAAAC 
GGCCTCCCAC 
ATCCTTGOCA 
GTTTTTGCCT 
TATTCCCTGG 
CCATGGCCTA 
CATCTGTTTA 
AAATGOBTGC 
CAAGTGACAA 
GCAACTTTCT 
CQQATGACTG 
CACAGA6CCC 
ACa\A£3CAATG 
AGCAQGAAGA 
GTAAQOACAA 
TCAGCATGAT 
AAGAASA6AA 
CCAATCTTGG 
GCCAGATGC3i 
ATAA<3AATGr 
TAGAAAOACr 
ATCTGACATC 
ACTTGCTAAA 
AGGTQGrTTT 
TCartXTCCTG 

GCATCAACTQ 
AAGCrCACTC 
TCtGGCAACC 
QAGACATATC 
TAATTAGAAO 
GCTAOGAGGC 
AGGAAACTGG 
AGGTCCCCTC 
TGGQAOAAQG 
GM3ACAAACI 
AGCCICOGiGG 
AGGCTCAGCT 
GCAACCACAT 

GAAIGGTAAT 



21 
I 

ASCTAAAATG 
CAOCAACCAA 
TACAGTAOCT 
AAATACACCT 
TTCCTCCAEA 
TCCTATACCT 
TGACATAATC 
AC2UUUSZftAC 
TQGCACCGCT 
AGATG2VrCCC 
GTGTTTAQAA 
GAAGATTTCA 
TCAAOACTTQ 
TGGACAGACT 
TGATGACAAG 
TGAGAAGACT 
AAACTATGAT 
CCTCAATGGT 
TTTCTOG3TT 
CTTAATAAAG 
TGCTAATOGQ 
ATTXCAOCTG 
AATTGCATaXS 
CTTOATTGAC 
ASCAGAAGGG 
AAATCCXTAT 
GGAACCOGCC 
QATGGAGAAG 
TGCCAGCCTC 
TAAGAATCTA 
CTTCAAnCAG 
GTA£7GGCAAC 
GGC»^ACA1T 
TGGAGGTOCG 
TCTGAC3\AGT 
TPTGAACAGC 
TCTCAGCFTT 

AGAAGGGGCA 
GTOGOSAGGA 
CTdCATCaGC 
TTTAAAAAGA 
67CTGGCAAA 
TGTAGGTTTC 
GTGGCC2VGCA 
CCAGTACAAG 
TTAAAAGTTA 
I3CAATAAAGT 



31 
I 

AAAGCCATCA 
OGCAACTCAG 
GCJU3CT6ATA 
TCTTTCCCAA 
ATTCCTACAC 
ACIGCTGCAG 
ACOGCTTCAT 
AATGAAAIGT 
TTATTG6AGA 
trGTGCTVGATA 
GGGTATTACT 
GTGACAGTAT 
CATAGT6AAA 
6TAATTCXTA 
TTTGTTAATG 
GTGACTGAQA 
TTGACCCTTC 
TTAGCATGCQ 
GCTTCCAGTC 
AAGA6T0GT6 
AACrOCCAAA 
ATCCTCACTA 
ATTGTCACAG 
GAAGACTTTC 
AGCGTCTTTC 
TOVAGACACA 
ATGGCCCCCA 
TGAGCACCAG 
TCTOAATGrOA 
TGACATTAAA 
TACAAAQTAC 
AAGAACCATT 
GCTCTTGAQT 
AGGGGATGAG 
CAQAATAGGG 
OCAGAGCTTG 
TTCAGQAGGC 
AAAAIGTCCA 
G3U3LAOTAAAA 
TCAATTAGAG 
AAAGGAGCAC 
AAAAATOCAG 
□GGTGCSAGA 
TGiWSGTGTOC 
ACACACAACC 
CTTTTACAAA 
TTTTASnTTGT 
GCCTTTGTXA 



41 



ttcatcttajc 

CTGATGCTGT 
CCACTGAJUVC 
CAOCTACTTC 
CTGCTCCCCC 
ACAGTGAGTC 
CTCCAAATGA 
CCCCCACCAC 
CCAGCMCCr 
ATTCQTTATQ 
ACAACTCTTC 
CAQAAACATT 
TTACTAGCTT 
CTGTAAOCAC 
TAACAATAGT 
AAATTAATAA 
GGXGTGATTA 
imxaCAAATC 
TCAAGTGTCC 
OQGOCCCZGA 
AGTGTGCATT 
TTGTGGGCAC 
aVAGATCAAA 
AAAATCXAAA 
CTAAGGTCA6 
GCAGCATGOC 
ACCAATQTAC 
TAAAGATCTG 
AGTTGTGftAT 
TGTAGTAGAT 
TGAGACAATG 
TOCAATCTAS 
TAAGTGACCT 
AAGGGA^TACC 
ACACTGCTTC 
C^AOCTAGCC 
GTGCCTGGGA 
CrATGGGGlG 
AACATGACCT 
AGOAOGCACC: 
TTCTCTAATC 
GAGTAAGAGC 
GGGAGCTTGT 
CATTQGQQOC 
ACACACACAC 
a:GTTA1i:TAGT 
TATTAITATT 
GATOGrGAhA 



51 
I 

TCTTCXTGCT 
AACAACCACA 
TAATTTCCCT 
ACCTGCTCCC 
CATAATTAGT 
AACCACAAAT 
TGGATTAATC 
AQAAGACAAT 
AAACAGCACA 
TQTTAAGCTG 
rACATGTAAfi 
TQACKCAGAA 

GrrrAAAC3A!r 

ATCrrCTGTCA 

AACAA^nra 

AGCAATTAGA 

TTATGGCTGT 
TGACCTGCAA 
TGATGOCTGC 
QTOTOCGTGC 
TGGCTACS^lr 
CSVrCQCTGGC 
TAACAAAAOG 
ACIGGGGTCG 
GATAAC6GCC 
COGOCCTGAC 
AAGCTATTAT 
GCCTCGGQQQ 
GTTT6CAAGG 
GCTATTAGCa 
GTTAGGGTTG 
AGGAAAQCTC 
AATTOCCCTG 
C3«:CACCTTT 
TATCXXrrCCA 
TCACCXZAAGA 
ATOCAGGAAC 
CACTCTACAO 
GGTAGAAOGA 
TGGGATCCAC 
ATGCCCTCCC 
CTTAGGTiT^G 
GCTCAGGAGT 

AACCACAC&C 

gtccsttttt 
t6ttcitgac 
AAAftAAAAAA 



Seq ID NO: C42 DNA Sequence

Nucleic Acid Accession #: NM_001432.1

Coding sequence: 167..676



X 

I 

TCACTTGCCr 
TGACAGCGGC 
tooscaocqg 
GAGGATGGA6 
TCCTCTAC3U3 
TOATAACTGC 
AACAAASTGT 
GOaCATGAGT 
CTTCTTTTTA 
TATTATTTTG 
AAATCGGAAAA 
GTTGCC3GGAA 
GAATGQCGCC 
CATTTTRTTA 
GTACITQAAA 
TGTGCAGAAA 
GCXAAKiaCT 
AOCAATTOOC 
ATATATAGTC 
GGTTTGACTG 
TAATGOWTT 
AAACAAGCIA 
TATATGA0GG 



11 
1 

GATATTIC^ 
TCTCdUKTCA 

ATOCTCTGTG 
GCAGTGCTCA 
AC3U3CTTTAIG 



C3VAAACTACT 
AOOGTCCMC 
TTTCTTATCA 
AGTAAAGAAC 
GTCIGAATAT 
ATCAAACTTA 
ATAATATTTA 
AATOTTTTTA 
ATATTTAATA 
TCATTSAAAlS 
AOGTCATAGC 
TiAATCGATTT 
TACCATGGTT 
AGAATTTGGG 
AC3CC TBGGGA 
GCAGTOGACA 



21 
1 

GTGTCAGAOG 
CTGCCGGGAfi 
ASCCCCA0CT 
COGGCAGOGtr 
6TACAACTGT 
TTCAGACAGA 
XGAATGOCTA 
GCAGGTGTGA 
AACCT'i'TAAG 
CAGTC5GTa3G 
CAAAGAAGGA 
G&GAGAGTTA 
TGGOCAGGGA 
TOTTGGGTCA 
TTTTTGTTTT 
TC2^AAAGAAA 



31 
I 

GACACAGOCA 
COCSTCTGCr 

CCCTGCdCTO 
GATTCCATCA 

TrGTTTGCAT 
AQTGGGTTAT 
OAAGAGTAX 
TTCCACATAT 
ATAimGAGA 



TCAAGAATTG 
AGTAAtSTATG 
TGATATGTAG 
AGAAflClUIAT 
GTCTATGGTC 
GTTOCCXATa 



TAAC2)GTGT6 
AGTGTTAGGT 
A-rtTITGACA 
ATTGATATTT 
TiVTATGCCTG 
TTAGCAAATG 
TTTOCTATGT 
TTGGCACCAT 
ATAGGTCCIG 



41 

1 

ACJSTGGGGTC 
OCOGCOCTGC 
TCX3COGATGA 
CTGCTCTGCC 
TGTATCOCRG 
CSQTOTGGCTC 
GGACAGTGCA 
ACT13GT3TCC 

Gr6ac?rTTQA 

TATTTCTGCA 
GTTACCTCAO 
TCCAGAGTT6 
GCIGG^AAT 
CAATAACACT 
GACTATTTQC 
TTATACAAGT 
GrGOU^AGTG 
ACABATTXCT 
TOCTCAAATC 
GQTATCaTAT 
TGTTAAAChC 



GCAACTCAOG ACTC!CTACftG 



51 
I 

OCITCI3U3GC 
OCGTGCACTC 
CCGOGGGGAa 
TOGGTTTCCA 
GAfiAGi:GC!niG 
AAGTGTCAAT 
TCEATCrOGr 
GATGTGAACA 
CCGrrOATTCT 
GATGGTACAG 
OaOATCXTLGA 
OCGGAAGTCr 
ATTAATATTC 
GrATTTTAAT 
TAATGTATAA 
AATTTCXrrGA 
CirrAGAAGTA 
GTRAGCCTAT 
AGTGATAATX 
ATTAAAACAA 
TACAOUTTTG 
mAATTCTOT 
GTIU^A&ICA 



60 
120

180

240
300
360 
420 
480 
540 
600
660 
720 

780

840 

900 

960

1020 

1080

1140 

1200 

1260

1320 
1380 
1440 
1500 
1560 
1620 
1680
1740
1800
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400
2460 
2520 

2580

2640 
2700 
2760 
2820 
2880 
2887 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 
CtChTCIACC AGATTCTGCC TATtSTAAAAT GAATTtaAAAA ACAATTTTCI 6TAATCTTTT 1440

ATTTAAGTAG TGGGCATTTC ATAGCTTCAC AATOTTCCTT TlTrTGTATAT TACAACATTI ISOO 

ATGIGAGGTA ATTATTQCTC AACTiGACAflT TAGAAAAAAG TCXMACTTG AAGCCTHAAT 1560 

_ TTGTGCTTTT TAAC3AATATT TXTAGACTAT TTCTTTTTAT AGGGGCTTTG CTGAATTCtA 162D 

J ACATTAAATC ACAlQCXXS^AA ATTTGATGGA CTAATTATTA TTTTAAAATA TATGfiABACA 1680 

ATAATTCTAC ATGTTGTCTr AAGATGGAAA TACAOTTATT TCATCXTTTA TTCAAGGAAG 1740 

TTITAACTTT AATACAGCTC AGTAAATGGC TTCTTCTAGA ATGTAAAQTT ATGTATTTAA IflOO 

AGTTGTATCrr TGACACAOCaA AAT6GGAAAA AACTTAAAAA TTAATATGGT GTATTTTTCC 1860 
AAATGAAAAA TCTCAATTGA AAGCTTTTAA AATQTAGAAA CTTAAACACA CCTTCCTGTG 

l\) GAGGCTGAGA TGAAAACTAQ GGCTCATTTT CCTGACATTT OTTTATTTTT TGQAAGAGAC 1980 

AAAGATTTCr TCTGCACTCT 6AGCCCATAG GTCTCAGAGA GTIAATAGGA GTATTTTTGG 204 Q 

GCTATTGCAT AAGGAGCXZAC TGCTGCCACC ACTTTTGQAT TTTATGGQAG GCTOCTTCAT 2100 

CGAATGCTAA ACCTTTGAGT AGAGTCTCCX2 TGOATCACM! ACCAGGTCAG GGAGQATCTO 2160 

TTCTTCCTCT AOCTTTATCC TGOCATGTGC TAGGGlAAAC GAA<;GCATAA TAAGOOVIGS 2220 

13 CTGAOCTCIG GAGCACCAGG TGCCAGGACT TGTCTOaTG TGXATCCTiTG CATTATATAC 2280 

CClGGraCAA TCACA06ACT <3TCA.TCTARA GTCCTGGCCC TGQCCCTTAC TA'I^AGGAAA 2340 

ATAAACAGIAC AAAAACAAGT AAATAXATAT 6GTCCTATAC ATATOrGtCATA XATATTCATA 2400 

TACAAACATG TATGTATACA TGI^CTTAAT GGATCATAGA ATTGCAqTCA TTTGGTCCTC 2460 

OA "I^TAACCAT TTATATAAAA CTTAAAAACA AGAQAAAAGA AAAATCAATT AGATCTAAAC 2520 

-ZU AQTTATTTCT GTTTCCTATT TAATATAGCT GAAGUCSUVAA TATQTAAGAA CACATTTTAA 2 580 

ATACTCTACI TACAGTTGGC CCTCTETQGT TAGTTCCACA TCTGTGGATT CAACCAACCA 2640 

AGGA OGGA AA ATGCTTAAAA AATAATACAA CAACAACIAAA AAATACATTA TAACAACIAT 2700 

TTACTTTTTT TTTTTTCTXr TTGRGATGGA 6TCTCQCTCT GTTGGCX3U3G TTOGAOIGGA 2760 

GTGGCACQAT CTCGGpICAC TGCAACCTCA CCTOCGGGGT l-CAAOAGATC CTCCTGCCTC 3B2D 

AGCCTCCTGA GCAQCTQGGA CTACAOGCGC ATGCXZACCAT GCCCAGCTAA TTTTTGTATT 2880 

TTTAQTAQAG GCGGGGTTTC ACCATaTTGG CCAGGATGGT CTCAATCTC3C TAACCTTGAG 2940 

ATCCACCCTC CACAGOCrC3C CAAACTGCTG OGATTAC3\Ga CGT8AGCCAC GGCAlGGTAGC 3000 

ATOrrACATTA <^ATTACAA GTAATGTAAA OAiaASTTAA GXATACAGGA QOAlXnGAAT 3060 

AQGTTATATG CAAGCACTAT GOCXTTTTTAT AXAAGIGACT TQftACATCTG TGOOGGATTT 3120 

jKI TAGTATGTGC A0GGGGGCGA TCIGGGAATC AaTCX:CCTGT GGATACCAAG GTACAAJOTOT 31B0 

ATTTATTAAC GCTTTACTAGA TQTOAC3GAGA GTCTGAATAT TTTCAGTGAT CTTGGCTGTT 3240 

TCAAAAAAAT CTATTGACTT TTCAATAAAT CASCTGCAAT OCATTTATTT C^LTTTACZAAA 3300 

AQATTTATTG TAAGCCTCTC AATCTTGGTT TTXCAGTTOA TCTTAAGCAT GTCAAXTCAT 3360 

AAAAACAAGT CATTTTTffTA rTTrrCATCT TTAAQAATGC TTAAAAAAGC TAATGCCTAA 3420 

AATAGTTAGA TCTTTGTAAA TGCATATTAA ATAATAAAQT ATGACCCACA TTACTTTTTA 3 4 BO 

T SGG TGAAAA TAAQACAAAA AXAATAGrTT TAQTGAGGAT GGTGCIGAGT AAACATAAAA 3540 

ACTOATTTGC TCTCAGCIOA TGTGTCCTGT ACACAGTGGG AAGATTTTAG TTCACACTTA 3600 

GTCTAACTCC CXKaLTTTTAC AGATTTCTCA CTATATATAT TTCIASAAGQ GGCTATGCAT 3660 

ATTCAATGTA TTGAGAACCA AAGCAACXM AAAXGCATAA A3GCATAATT TATGGTCIXC 3720 

*frU AAGCAASGCC: ACATAATAAC OCAGTTAACT TACTCTTTAA CCAGGAATAT TAAOTTCTAT 3780 

AACTAOZACr CAAG6TTTAA CXZTTAAAATT AAGATTTCCT TAACCTTAAC CTTAAAATTO 3840 

ATATTATATT AAACATACAT AAXACAATGT AACTCCACTG TTCTCCTGAA TATXTTTTGC 3900 

TCTAATCTCT CTGCCGAAAQ TCAAAGTGAT GGGAGAATTS QTATACTG6T ATGACXAOGT 3960 

CTTAAGTCSwa ATTTTTATTX ATGAQTCTTT GAGACTRAAT TCAATCRCCA CC»BGTRTCA 4020 

AAT CAACr XT XAXGCASCAA ATATAXGATT CTAGTGTCTG ACTTTTGTXA AAXTCAGTAA 4080 

XGCAGnTTT AAAAACCTGT ATCIGAOCCA CTTTGXAAXT TXTGCTOCaA XAXCCATTCX 4140 

QTMSRCITTT GAAAAAAAAG XITXXAAITT GATGCCCAAT AXAXTCTGAC CGTTAAAAAA 4200 

TTCITOTXCA XATGQGAGAA OcaOGQaG^TOA XGACITGTAC AAACAISTATT TCTGGTGXAX 4260 

ATTT XAATGT XXXTAAAAAG AGTAATXXCA TTTAAATATC TCTTATTCRA AXXTGATSAT 4320 

GTTAAATGXA ATATAATGXA IXTICXTXTT ATXTTGCACT CrfiXAATTOC ACTTTXTAAG 4360 

XTTQAaOAGC CATX TXGaTA AAOQGmXT ATTAAAJOAXG CTATGGAACA TAAAGTT3IA 4440 

TTGCATGCAA XTTAAAGTAA CTXATTTGAC TATGAAXATX AXOSGATTAC TGAATTGTAT 4500 

CAAXTXGTTT GTGT TCAATA TCAGCTTXGA TAAtTOTOrA CCTTAAGAXA XTGAAGGAOA 4560 

AAATAGATAA TXTACAAGAT AXTAXTAATI TTTATTIAXT TTXCXTGGQA ASTGAAAAAA 1620 

AXT6AAATAA ATAAAAAXGC AITGAACATC XTGCAXXCAA AATCTTCACT 6AC 4G73 

Seq ID NO: C43 DNA Sequence
Nucleic Acid Accession #: AF011468.1
Coding sequence: 257..1468

1 11 21 31 41 51

1 I I I I 1 

^ _ OGAAGACTTG GGTCCtlGGQ TCOCAGGTGG GAGCCGAOOG GXGGQTAGAC 06XGG0SGAT 60 

DD ATCTCA6TGG OGQACSQAOGA OGGOGOGGAC AAGGGGOGGC TGGXOGGAGT GGGGGAGOGT 120 

CftA GTCOQ CP GIOGGTXCCT OCGTCCCTGA GTGTGCITQQ OGCTGCCTTG XGCCOGCCCA 180 

GCQOCTTTGC ATC3CGCTCCP GGGCaCOQM GCGCCCTCTA GGATACTQCT TQTTACTTAT 240 

TACAflCTAGA GGCATCATGa AOOGAXCTAA AGAAAACTGC ATTTCAQGAC CIGTTRAGGC 300 

TACAGCXCO^ GTTGGAGGTC CAAAAOGTQT XCTC33TGACT CAGCAAAXTC CTT3TCAGAA 360 

f\) TCCAXTACCT GCAAATAGTG GOCaGGCTCA GCQQGTCTTG T G TOCTTCAA A' ri ' U ' lTUJ CA 420 

GCBOGTTQCT TT6CAAGCAC AAAAOCTTGX CXCCAGTCAC AAGOOaGXTC AGAATCAGAA 4BD 

GCAfiAAGCAA XxeCAGGCAA CCAGTGXACC XCATCCTGTC XCCA(S5CCAC XGAATAACAC 540 

CCAAAAORGC AAGCAGCOCC TGCXATCGGC ACCTGAAAAT AAXCCTGAGG AGGAACTGGC 600 

AXCAAAACAG AAAAAXGAAG AATCAAAAAA QAGOCAGTGG GCXTTGGAAG ACIXIGAAAT 660 

XG GTOacC Cr CXGGGXAAAjG OAAAGXTTOG TAATGTTTAX XXGGCAAQAG AAAAGCAAAG 720 

CAAOTTTAIT CXGGCXCTTA AAGIGTTAXT TAAAGCTCAG CTGGAGAA2US CXXIGAOTOOA 78D 

GCATCAGCTC AGAAGAGAA6 TAQAAATACA GTOCCACCXI CX3GCATCCTA AXAXTCTTAG 640 

ACTGTATGGT XAXIICCATG ATGCTACCAfi AGTCTACCTA ATTCTGSAAT AXGCACCACT 900 

XC3GAACAGTT TAXACAGAAC XTCAOAAACT XTCAAAGTXX GAIGAGCAOA GAACTGCTAC 960 

OU TXATATAACA GAAXIGGCAA ATOCCCTGIC TTACTGXCAX TOOAAGAGftS XXA3rCCATAG 1020 

AQACKITAAG OCAGAQAACT TAClTCTTOa ATCAOCTGGA GAGCTXAAAA XTGCAOATTT 1080 

TGGQTG^rCTL GTACATGCTC CATCTXCXSUS GAGGACXACT CTCT6XQGCA CCX^IGGACTA 1140 

CCT GCCCOCT GAAATGATTG AAQG TOGGAT GCATGA!K3AO AAGGXGGATC ICXYSGAGCCT 1200 

TGraiGTTCTT XCSCrAXQAAT TTTXAGTTGG GAAGCCTCCT XXXGAGGCAA ACACAXACCA 1260 
ACAGACCrrAC AAAAGAATAT CACt5GC3TTQA ATTCACATTC CCTGACTTTG TAACAGAGGG 1320

AGCCAeaSAC CTCATTTCAA GACTGTTGAA GCATAATCCX: AGCCA6AGGC CAAIGCTCAG 1380 

AGRflOTACTT GAACAOCCCT GGATCACAGC AAATTCR.TCA AAAOO^TCAft AlnrOCCftAAA 1440 

GAAflGAATCA GCTAGCARAC AOTCTTAGGA ATGGTGCAiGO OGGAGAAATC CTTGAGC3CAO 1500 

3 GGCTGCCATA TAACCTGACA GGAACaiGCT ACTGAAGTTT ATTTTAOCAT TGflCTGCTGC 15S0 

CCTCAATCTA GAAOQCTACA CAAGAAATAT TTGTTTTACT CSVGCAGGTOT GCXTTTAACCT 1620 

CCCTATTCAQ AAAGCTCXJ^C ATCAATAAAC ATGACACTCT GAAGOXaAAAG TAGCCAC5GAG 1680 

AATTGTGCTA CTTATACTQG TTCRTAATCT GGRGGCAhGG TTCJGftCTGCA GC2GBC00CGT 1740 

CRGOCTGTOC TAGGCRTGGT GTCTTCACAG GAfiGCAAATC CRfiAGCCTOG CTGTtaGQGAA IBOO 

l\J AGTQACCACT CTGCCCTGAC CCCGATCAGT TA7WK3AGCTG TGCAATAACC TTC5CTAGTAC IBfiO 

CTGAGXGAGT GTGTAACTTA TTGGGTTGGC GAAGCCTGOT AAAGCTCJTTG GAATORGTAT 1920 

QTOATTCTTT TTAAGTATGA AAATAAAGAT ATATGTAC2U3 ACTTOTATIT TTTCTCrGGT l&BO 

GGCATTCCTT TAGGAATQCT OTGT6TCTGT OCGGCACXCC QGTAGGCCTG AraSOGTTTC 2040 

TAGTOCTCCT TAACCACTTA TCTCOCSkaCRT GAGAGTGTGA AAAATAG6AA CACGTOCICT 2lOO 

13 ACCTOCATTT AGGGATTTOC TTOGGATACA GAAGAGOGCA TG^IGTCTCAG AGCTGTTAAG 21S0 

GGCTTATTTT TTTAAAACAT TGGAGTCATA GCATGTGTGT AAACTTTAAA TATGCAAATA 2220 

AATAAGTATC TATGTCTAAA AAAAAAAAAA AAA 2253 

Seq ID NO: C44 DNA Sequence
Nucleic Acid Accession #: NM_013372
Coding sequence: 53..617

1 11 21 31 41 51 

9^ I I I ) I 1 

M GOGGCC3GCAC TCftGCGCCAC GOGTCXSAAAG CGCftGGCCCC GAGGACCOSC GGCACTGACa 60 

GTATGAGCCG CACRGCCTAC ACGGaTGGCSAfl CCCTGCTTCT CCtCXTGGGG ACCCTGCTGC 120 

COGCIGCTGA AGGGAAAAAG AAAGGGTOCC AAGGXGCCAT CGOCXXX3CCA GACAAGGCCC 180 

AGC3kCm!6A CTCAGAGCAG ACTCACTCGC CC3C!AGCA60C TGQCXCCbGG AACCSGOGOGC 240 

GGGGCCAAG6 GCOGGGCACT GOCATGCC3CG GGGRGGAGGT GCiaa2U3TOC AQCCMGhGG SOO 

D\J CCCTQCATGT GACOGAGOGC AAATACCTGA AOCOaGACTG GTGCAAAAOC CAGCCBCTTA 360 

AGCAGACCAT CCAOGAGGAA GGCXGC3U«a. GTCGCACCAT CATCAAOOQC TTCTGTTACG 420 

GCC AGTO CAA CTCTTTCTRC ATCOOCAQGfC ACATCCQGAA OGAGGAAOGT ICCTTICAGT 480 

CCXGCTCdT CTQCAAGCC3C AAGAAATTCA CTACCAXGAT GGTCACACTC AACXGCCClG 54Q 

AACTACAGCC ACCIACCAAG AAGAAGAGAG TCACA08TGT GAAGCAGlGT CCTTGCATAT 600 

3D CCATCGATTT BOATTAAGCC AAATCCAQGT GCAOCChfiCA TQTCCTAGGA ATGCAGCCOC 660 

AGOAAGTCCC AGACCTAAAA CAACCA6ATT CTTACTTGGC TTAAACCTAG AOQCCAOAAG 720 

AAOCCCCRQC TQCCTCCTGG C3\QGAGCCTG CTTGTQCGTA OTTOSTGTGC ATGA8TG!tGG 780 

ATGQ gTgQCT GTGGGTGTTT TTAG3UACCA QAOAAAACAC AGTCTClGCT AGAGAGCTCT 840 

CCCTAlWlt* TAAACATATC TGCTTTAATG GGGATGTACC AOAAACCCAC CICA0CCG66 900

CTCAmrCTA AAGGGGOGGG GCCX3TSGTCT GGTTCTGACT TTGTCrTTTT OTQCJCCTCCr 960

<3QGGAC CAGA ATCSTCCTTTC tSGAATGAATG TTCftaJOSRAB AGGCTCCTCT GAC3GGCAAGA 1020 

GB^CIGTTTT AGXGCXGCAT ICGACAtGGA AAAGTCCTTT TAACCXOTGC TTGCATOCTC 1080 

CTTTOCrCCr CCICCTCACA ATCC3VTCTCT TCTIAABTTQ ATM3TG2U7A T6TCA£rTCZA 1140 

ATCrcrTCTT TGCCAAOGTT OCTAAATTAA TTCaCTTAAC CATGATCGCAA ATOTTTTTCA 1200

TTrrGTQAAG ACJCCTOCSiGA CTCIGGGAGA QQCTO G ' I G TG GGCAAGGAO^ AGCAOSATAG 1260

TGGAGTGftGA AAGOGAGGGT GGAGGOIGAG GOCAAATCAG GTCCftGCAAA AGTCAGTASG 1320 

GACATIBCaG AAGCTOXSAAA GGCCAAIACC AiSAACACAGG CTGATGCTTC TGAOAAAaTC 1380 

TTTTCCTAGT ATTTAAOKSA ACOCAAQTGA ACAGAGGAGA AATGAG&TTG CCAGAAA6T6 1440 

ATTAACZTTO GCOSTTGCAA TCTGCTCAAA CCrAAC3U::CA AACTGAAAAC ATTAAATACTG 1500 

AOCACTOCTA TGTTCGGACC CRAGCAAGTT AGCTAAACCA AAOCAACTOC TCTGCTTTGT 1560

COCTCAGQTO QAAAAfiAGAG GTAGTXTAGA ACTCTCTGCA TAGGGGTGGG AAlTAAarCRA 1620 

AAA CCKCA GA GGCTGAAATT CCTAATACCT TTOCTTTATC GIGGTTAaAG XOU3CICATT 1680 

TGCAXTOCAC TRTTTCOCAT AATGCrrCTG AQAGOCACTA ACrraWIIGA TftAAGftiKSCT 1740

f f GCXa CCTGCTG AGTGXACCia ACAGTAAGTC TAAAGATGAR ACAGTST3U30 QACTACTCTG 1800 

TTTTAGCAAG AHATATTKTG GOGGTCTTTT TGTTTXAACT ATTOTCAGGA GATTGGGCTA 1860

RAX3AGAAQAC GACJGAGftQTA AGGAAATAAA GGORATTGCC TCTQSCTAGA BftGTAAGTTA 1920 

QOTGTTAATA CCTGGTAGAA ATOTAAGGGA TATGAC2CTCC CTTTCTTTAT GTGCK3UM 1980

aGGflJ CTOAB G6GAOCCTPT TAGGAGAGCA TAOCATCATG KEGnOTHOC TQTTCAICIG 2040

g^CraJTrg GATGGACATA ACIATTGTSUV CTAIICAGTA TTTACTGGTA GGCACIGTCC 2100 

TCTGATTAAA CTTGGCCTAC TtSGCAATGGC tEACTT»GGA7 TGATCTAAGG GOCAAAGTGC 2160

AaaOTGGGTG AACTTTATTG TACTTTGQAT TTGGTTAAOC nSTTTTCTTC AAGGCTGAG6 2220 

TTTTWEMAO AAACTGCClG AATACTCTTT n!aC£3TQTA TCTTCTCA6C CTOClftflCCa 2280 

A^OCTATGT AATA3?GGAAA ACAAACACIQ CA@U3TGAG ATTCSUSTTGC OGATCAAGGC 2340 

TCTGGCATDC l^AGAACGCT TGCAACTOGA GAA6CTGTTT TOATTTOGTT TTTGlrTTGA 2400 

TCCftGTGCTC TCCXATCTAA CAACTAAACA GGACCCATTT CAAGGCJOSGA GATATTTTAA 2460

ACACXX»AAA TGTTGGGTCT GATTTTCARA CTTTTAAAGT CACTACTGAT GATTCTCACXS 2520 

CBAfiScaSAAT TTGTCCAAAC ACATAGTGTG TOTOTTTTGT ATACACXGIA TOACCOCACC 2580 

CCAAATCIXT OTATTGICCA CATTCTOCAA CAAITAAAGCA CAGAOTGGAX TOaOJlMXC 2640

ACMAAATGC TAAGGCASAA TrTTGAGGST GGGAGAGAAG AAAAGGGAAA GRAGCTGAAA 2700 

ATGTAAAACC ACACC2\fiGQA GGAAAAATGA CATTCAGAAC CAGCAAACAC TGAATTTCTC 2760

TTGTTGTTTT AACTCTGOCA CAAGAATGCA ATTTCGMAA MQflGATGAC TTAAGTTGGC 2820 

AGCaOTAATC TTCXXTXAOG AGCTTGTACC ACAQTCTTGC ACATAAGTGC AGftTTTatSCT 2880 

CAAGTAAAGA OAATTTCCTC AACACTAACT TCRCTGGGAT AATCAGCaSC G13»CtftOCX? 2940 

TAAAASCATA TCAClASGCA AAGAGGGAAA TATCTGTTCT TCTTACTGTG CCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGRA AATGCCATAT CTATACCATA 3060

TTTTATTC5GA GTCACTQATG ATGTAA3GAT ATATTTTTTC ATTATTATAG TAOAATATTT 3120 

TTATGGCAAQ ATATTTGTGG TCITGATCAT AOCTATTAAA ATAATQCCaA ACAOCAAAtA 3180

TGAATTTTAT GA TGTA CftCT TTGTGCTTGG CSUTTAAAAGA AAAAAACACA CATCCTGGAA 3240 

OTCrGTAAGT TOTTTTTreT TACTGTnflGT CTTCAAAGTT AAGAtSlGTAA GTGAAAAATC 3300 

TQSAGQAgAG GATAATTTCC ACIGTGTGGA ATQTGAATAO TTAAATGAAA AGrXATGOTT 3360 

ATTTAATCTA ATTAT TACT T CAAATCCTTT GGTCACTGTG ATTTCSIAGCA TGTTTTCTTT 3420 

TTCTCCTTTA TATGACTTTC XCTQAQTTGG GCAAAGAAQA AQCTOACACA CCX3TATGTTG 3480 

TTAGAGTCTT TTATCTGGTC A GGGG AAACA AAATCITGAC OCAGCTGAAC ATGrCTTCCT 3540 

GAGXCAGTaC CTGAATCTTT ATTTTTTAAA TTGAATGTTC CCXAAAi3aTT AACATTTCTA 3600 
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AAGCRATATT AAG^WVGACT TTAAATGTTA TTTTGGaAGA CTTAOGRIGC ATGTATACAA 3660 

ACXSAArAGCA OATAATGATG ACTAOTTCAC ACATAAA6TC CTTTTAAGGA GAAAATCTAA 3720 

AATGAAAAGT GGATAAACAG AACATTTATA A6TGATCAOT TAAT6CCIAA GAGTGAAMST 3780 

AGTTCTATTG ACATTCCTCA AGATATTTAA TATCAACTGC ATTATGTA1T ATGrCTCSCTT 3840 

AAATCATTTA AAAACGGCAA AGAATTATAT AGACTATGAG GTACCTTGCT GTOTAGGflGG 3900

A1GAAAGGGG AGTTGATAGT CTCATAAAAC TAATTTGGCT TCAfiGTTTCA TGAATCTTGTA 3960

ACTAQAATTT AATTTTCACC CCAATAATGT TCTATATAGC CTCXGCTAAA GAQCAACTAA 4020 

TAAATTAAAC CTATTCTTTC AAAAAAAAA 4049 

Seq ID NO: C45 DNA Sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 200..2532

1 11 21 31 41 51

| | | | | |

ATTGCTGATG GATCAGTGAG CCTGTGTtCA TGCCftGTGiaG CTGCrKTIGGC TCAfiATACTG 60 

ATACITTCTT TCCAAACAGC ATAAiGAAQTO ATTG3U3CCAC AAGTATACTO AAGGAAGGGC 120

TCXX:TGGAGT TCTOGTGTGA AGAGATAAAT C7LCCAGTCAC ASACTATGCA CCCGACTGCT 180 

GCCGfTVCROr CCABGGAAAA TGAAAC3TTGG AQTGCTGTGQ CTCATTTCTT TCTTCAOCTT 240 

CACTGACGGC CACGGTGGCT TCCTGC3GG2VA AAATGATGGC ATCAAAACAA AAAAAQAACT 300

CATTOTQAAT AAGAAAAAAC ATCTAiGGCCC AGTOGAAGAA TATCAGCTGC TGCXTCAGGT 360 

GACCTATAGA GATTCCAAGG AGAAAAGAGA TTTGAGAAAT TTTCXGAJUSC TCTTGAAGCC 420 

rCCAITATTA TQOTCIACATG GGCTAATTAG AATTATCAiGA GCAAAGGCTA CCACAGACTG 480

CAACAGCCTG AATGGAGTOC TGCAGT6TAC CTGTGAAGAC AGCMACACCT GGTTTCCXCC 540

CTCATGCCTT GATCCCCAGA ACTGCTACCT TCACAOSGCT GGAQC31CTCC CAAGCTOTGA 600

ATGTCATCTC AACAACCTCA GCCAGAGTGT CAATTTCTGT GAGRGAACAA AGATTTGCGG 660 

CACXTTCAAA ATTAATGAAA GOTTTACAAA TQACCTTTTQ AATTCATClT CTGCTATATA 720 

CTCCAAATAT GCSU^TGGAA TOXSAAATTCA ACTTA»^AAA 6CATATGAA71 GfiATTCAAGG 780 

TTTTGAGTCG GTTCAGGTCA CCCAATTTCG AAATGGAAGC ATCGTTGCrG GGTATGAAGT 840

TGTTGOCTCX: AQCAGTGCAT CTGAACTGCT GTCAGCCATT GRACATGTTG CCGAGAAGGC 900

TAAGACACCC CTTCACAAGC T6TTTCCA1T AGAJiSACGGC TCTTTCAGflG TGTTOGQAAA 960 

AG&CCZAiaraT AATOACATTG TCTTTGGATT TGGGTCX3VAG GATGATGAAT ATACCCTGCX: 1020 

CTGCAGCAl^r GGCTACAOGG GAAAGATCAjC! AGGCTiAGTQT GAlGTCCTCTO GGTGGCAGOT 1080

CATCAQGGAG ACTTGTGTGC TCTCTCTGCT TGAAGAACTG AACAAGAATT TCAjSTATGAT 1140 

TGTAGGCAAT GCCACTGAGG CAaCTGTQTC ATCCTTCGTG CAAAATCTTT CTGTCATCaT 1200

TOQQCAAAAC CCATCAACCA CAGTGQGGAA TCTGGCTTCG GTGGTGTOGA TTCTGAGCRA 1260 

TATTTCKICI CTGTGACTGG CC»GCC»rTT CAGOGTaTCC IVATTCAAjCAA TGaAGOASOT 1320 

CRIC3U3TATA GCTTGACAATA TOCTTAATTC AGOCTCAlSXA AOCAACrGGft CM3TCXXACI 1380 

60GG6AAGAA AAGTATGOCA 0CTCAO3GTT ACTASAGACA TTAOAAAACA TCAGCACTCT 1440

OGTGCCTCOG ACAGCTCTTC CTCTGAATTT TTCTCGGAAA TTCATTGACT QGAAAGGGAT 1500

TCCAGTGAAC AAAAGCCAAC TCAAAAGOGQ TTACAQCTAT CAGATTAAAA TOTGTCOCCA 1560

AAATACATCT ATTGCCATCA. GAGGCC6T6T 6TTAATTGGG TCAGACCAAT TOCaVSAGATC 1620 

OCTTOCAGAA ACTATTATCA GGATGGCCIC OTTGACTCTS GGQAAaLTTC TAOCCOTTTC 1680

caiAAAATGGA AATGCTCRGG TCAAT^SAOC TCTGATATCC AOGSTTaTrC AAAACIATXC 1740

CAarAAATOAA QTTTTCCTAT TTTTTTCCAA GATAGAiGTCA AAGCTGRGCC AGOCTCATTG 1800

TG7X3TTTTGG GATXTCAGTC ATTTGCAGTG GAACGATGCA GGCTGOC:aCC TAGTGAATGA 1860 

AftCTCAAGAC ATCQTOAC3GT CCCRATCTAC TCACTTGACC TOCTTCTCCA TATTCATGTC 1920 

ACCTTTT6TC OOCTCTACAA ^CCTXCCCCGT TCTTAAAATGe ATCAOCTATG 1OG6ACXG06 1980 

TATCTCCATT GGAAGTCTCA TTTTATGOCT GATCATOGAG G CiVrUTi 'TT GGAAGCAGAT 2040 

TAAAAAAAGC CAAAGCTCIG ACACACGTCG TAXXTGCA^TG GOXSAACAtmS CCClOTCXXT 2100

CTTGATTGCT GATGTCTQQT TTATTGTTGO TGCCACA6TG GACAOCAOGG TGAACCCTTC 2160 

^rOQAGTCrGC ACAGCTOCXG TGTTCXXXAC ACftCZICTXC TACGTCXCII TGIICITCIG 2220 

GATGCrCATG CrSQGCKFCC TGCIGGCTXA COGQATCATC CTOOTGTTQC ATCACAXOOC 2280 

CCAaCniTie ATGATGGCTG TTGGATCTIG CCTGG6TTAT OG6XGCOCTC TCATTAXATC 2340 

TGTCATTACC ATTGCTGTC!A CGCAAC(!7TAO CAATACCTAC AAAAGGAAAG ATGTOTGTTG 2400

GCTTAACTQQ TCCAATGGAA GCAAACCACT OCTGGCXTTT GTrGTCCCTG CACTQGCTAT 2460 

TQTGGGT6TG AACTTCCSTTG TOaTGCTacr AGTTCTCACA AASCTCTGGA GGCGQACrGT 2520 

TGGGOAAAISA CTQAGTCGGQ ATGACAAGGC CACOViTCKrC 08G8VGGGGA ASnSCCIGCT 2580

CATTCTGAOC CCICTGCTAG GGC^FCACCTQ GGGCTXTQOA AZAQGAACAA TAQTQGACAQ 2640

CX]AGAAXCTG GCTXGGCATG TTATTTTTSC TTTACTCAAT GCATTCCAGG GATTTTTTAT 2700

CTTATGCTTT GGAATACTCT TGGACAGTAA GCTGCGACAA CTTCTGTTCA ACAAGTTGTC 2760 

TOCCTTAAGr TCTTGSAAGC AAACAGAAAA GOUUMCTCA TCAQATTTAT CTGGCAAACC 2820 

C3MTTCICA AAGCCTTTCA AOCXACIGO^ AAftCAAAGGC CMmOGCMC mCXCATAC 2880 

TGSAX^TTOC TC0(3ACAIIC3L TCATOCTAAC TCAOTTTaTC TCAAAXGUT AAGGCAAGGA 2940

ATCATAAAAT CAAGAAAAAA TTTCCAGAAC AACTTGACKT ASGXCAATGA 3000

AGAAATTATG CTCA6TATTG OATCGQOTTT TCTGATTTAO GOOTCTGOQA ATAAAACAAG 3060 

AATGTOTCAG TGGCTTCA 3078 

Seq ID NO: C46 DNA Sequence
Nucleic Acid Accession #: NM_000584.1
Coding sequence: 75..374

1 11 21 31 41 51

| | | | | |

AGCAGAGCAC ACAAGCTTCr AOQACAAQAG CCAOGAAI^AA ACX3U:0GGAA GGAACCATCT 60

CftCTGrGTOT AAACATGACT TCCRAGCTGG CXKTGGCTCT CTTGGCAGOC TTCCTGATTT 120 

CTGCAOCXCT GTGTGAAaOT OCAGTTTTGC CAAGQAGTGC TAAAQAACTT AQATOTCAGT 180

GCAllAAAGAC ATACTCCAAA OCTTTCCAGC CCAAATTTAT CAAAGAACTG AG&lSIGATTG 240 

AcaABTGGACC ACACTGCOCC AACACAGAAA TTATEGTAAA GCXTTCIGAT GaAM3AGM3C 300

TCTGTGTGGA CXXCAAGGAA RACTGGGTQC AGAGGGTTQT GGftGAAGTTT TTGAAGAGQG 360

CIGAOAATTC ATAAAAAAAT TCATTCTCTG TGGIAXCCAA GAATCAGTGA AGATGCCAGT 420 

6AAACTTCAA OCAAATCTAC TTCAACACTT CATGTATTGT GTGGGTCIGT TGTAGGOTTG 480 

CCAjSATOtAA TACAABKnC CTGGTTAAAT XTGAATITCA GIAAAX»ATG AATAlGXTTTT 540 

CATTGTAOCA TSAAATATCC AGAACATACT TATA7GTAAA GXATTA7TTA TTTGAATCIA 600

CAAAAAACAA 
TACAAATAGC 
6AATGGGTTT 
TTTOCCftTAA 
CCTAGTCTGC 
TTTCTJWGTG 
TGGAAfiCACT 

ctatttatta 
agaaoatgaa 
oatattaaat 
acaattgggt 
agtacattat 
astcttgtca 

AACTATTflAR 
TOTTTATTAT 
AM3GCXTTAT 
TTCIC3ATT6T 
AA6IAAAAAA 



CAAATAATTT 
AAAATTQAGC 
GCTAGAATCJX 
AGTCARATTT 
TAGCCAGGAT 
GAAAAAGTAT 
TTAAGTTTTT 
TTTATGTATT 
TCATTGATTQ 
GATGTTTTAT 
ACCCAOTTAA 
TGTTTATCTG 
TTQCCAGCTG 
ACA6CCAAAA. 
GTACAAATAG 
ATTTTTAACT 
ATGGAAATAT 
AAAAAAAAA 



TTAAATAIAA 
CAAGQGCCAA 
GATAITTOAA 
A3CTGGAAAT 
OCACAAGTCC 
TAGCCACCAT 
TCATCATAAC 
TATTTAAQCA 
AATAGTTATA 
TAGATAAATT 
ATITTTCATTT 
AAAfiTTTTAA 

tgttggtagt 
ctccrcagtc 

ATTCTTATAA 
TTAftGATQTT 
AAAAGTAAAr 



GGATTTTCCr 
GAGAATATCC 
GCATCACATA 
CCTG3ATTTT 
TTGTTCCACT 
CTTACCTCAC 
ATAAATTATT 
TCAAATATTT 
AAGATOTTAT 
TC3\ATCAGG!G 
CAGATAAACA 
XtCGAACTAAC 

AATATTAGTA 
TATTATTTAA 
TTTArGTCaCr 
ATOAAACATT 



A£5ATATTGCA 
GAACTTTAAT 
AAAATGATGG 
TTTCTOTTAA 
GTGCCTTGGT 
AGTOATOTTG 
TTCAA6TGTA 
STGCAAGAAT 
A6TAAATTTA 
TTTTTAGATT 
ACAAATAATT 
AATCCTAGTT 
ATTAOGGAAT 

ATGACTGCAT 
CTOCAAATTT 
TAAAATATAA 



CGGGAGAAl-A 
TTCAQtaAATT 
GACAATAAAT 
ATCIGGGAAC 
TTCTCCTTTA 
TGAGGACATQ 
ACTTATTAAC 
TTOGAAAAAT 
TTTTATTTTA 
AAACRAAGAA 
TTITA6TATA 
TOATACrCCC 
AATGAGTTAS 
QGTTGftAACT 
TTTTAAATAC 
TTTTTACTGT 
TTTGTTQTCA 



Seq ID NO: C47 DNA Sequence

Nucleic Acid Accession #: NM_005603.1

Coding sequence: 1..3756



1 

1 

ATSAGTACAG 
GTGGTTCCCT 
GAACCAOAAC 
QAAIGTACAO- 
AACACAAAAT 
AAGTACAAGG 
AATTTATATT 
TGGTRCACCA 
GTOQACGATG 
ATTAAGGATG 
CJOTCCOAAAA 
AACAGOCTCT 

OGTXTTATTG 
TGOAGAAACA 
AGGAACAOCO 
AAGAATAGTG 
GTTTACACGR 
TATT GGGRftG 
CCCTOCTACC 
OCCATCTCTC 
TGGGACCTQC 
CTCAATGAAC 
CAAAATAICA 

eAxoccTcrc 

OCTSATGGGA 
GAGCC3U3AAG 
AGGACTGATO 
GCEGCCAGGA 
QAACIGGGCA 
AAGOC3AATGT 
GCTGACACTG 
GftZaOCCTGG 
ATTGaaGAAA 
ACCAACCOGG 
CTGGGAGCTA 
CTTGCAAAAG 
AATATAI3GAT 
ATTAATICrC 
AaSTTTGCAC 
ATCACTGGTr 
CTGAAGCOIGA 
AGGCTAGAAG 
ACSCOCAQTCA 
AAGAGGTACA 
ATCAAAACTG 
TCGAGTGACT 
CGATGCTCTT 
TTTAlCXTTGG 
GROGArrGGT 
GQGCTGCTGG 
GTGGOACAAA 
GTOCTAACAT 
CASGAlXaQAS 
GIAATAACAG 
TTTTCAATTT 
GGAATACATG 
AGACA6CCAT 



11 
I 

AAAGAGACTC 
ACAGTGATGA 
AAAACCGAGT 
GGCAAGTCAA 
TCTTGXGTAT 
CATTTACCTT 
TCCTGGCTCT 
CACTAGTGCKr 
TGGCTCX3CC31 
GCSWSGTTCAA 
AAAATGATTT 
QCTATGTGGA 
AAATC3lCn(3A 
AATGTGAAGA 
CAAGTTTTCC 
ATTTCTGCCA 
GGA2yiACCAG 
TCTTTGTTGT 
CACAGGTGGG 
GTGGAITCCT 
TCTATGTCAG 
AAAIGTACTA 
AGCrOGGGCA 
TGAOCTXrAA 
AACAiClAAOCA 
AGCTTGCATT 
TACQACAGTT 
GTCAGCTCAA 
ACTTTGQCTr 
CTGAAAGGAC 
CTATCATTGr 
TTATTTATGA 
ATATCTTTOC 
AAGAATTTAC 
ACGAAGCTCT 
CAGCTATTGA 
CTGACATTAA 
TTGCITQTQA 
TTCTTCATGC 
CIGCIGTGCA 
CTTCfti'iiGAA 
AGTTCCCAAG 
CTAAGAAAGA 
TCXGCTGCaa 
AGAAAGCCaVT 
CCCACATTQQ 
ATTCCITTGC 
ACATAAGGAT 
TTCATTTCTG 

AGCAGQATGT 
GAGACTXACr 
CGATGATCCr 



TCAATTTCCA 
TI TOAGC AT 
TTCTCTTTCC 
AQVTTTGCTT 



21 
I 

AGAAAOGACA 
TGAAACAQAA 
CAACAGGGAA 
AGC!AAA£9GAT 
TAAGGAGAGT 
TATACCAATO 
TCTTATCTTA 

cciGCTTora 

TAAAATGGAT 
AGTTGCTAAQ 
TGTTCCAGCT 
AA£:AGCAaAA 
CCAGTACCTC 
ACCX^AATAAC 
TTTGGATGCT 
OGGCTTAGTC 
ATTTAAAAGA 
TCTTATTiCTG 
CAAITCCTCT 
CATTTTCTGG 
CGTGGAAOTG 
TGCTGM3AAG 
GATCC3VTTAT 
AAAGTGCTGT 
CAACAAAATA 
TTATGACCAC 
CTTCTTCTTG 
CIACCAQGCA 
TGCCTTCCTC 
Tl^ACAATQTT 
AAGAAOCX3CA 
AOGQTTACAT 
AAATGAAACT 
AGAArr(MAAT 
GGATAAA6TA 
AGACAAGCTA 
gatctgggtg 
ACTTCTGAjCT 
AAGGATGGAA 

GGAATcamr 

TGARATTCTT 
AACAGAAGAA 
GCAGCGGCAG 
OGTCAOOCCC 
CACGCTGGC2C 
CGTTQGAATA 
TCAOTTCCGA 
GT6CAAGTTC 
6TACTC5CTTC 
CTACAACGTG 
GAOTGACAAA 
ATTC3UU:TAT 
CTTCTTCATA 
CGACXACCAG 
GATTGGCTTG 
TGCACTTTAT 
ATCTGCATTT 
AACTATCATC 



31 
I 

TXTGACQAOG 
GA1GAACTTG 
GCAGAGGAGA 
CGCAAGXACC 
AAATAW3CQA 
AATCTQTTTG 
CAGGCAGTTC 
DTGCTGGGC6 
AAGGAAATCA 
TGGAAAGAAA 
GACA'TTCTCC 
CTGGACGGAG 
CAAAGAGAAG 
C5GACTAGATA 
GATAAAATTT 
ATTTTTGCAG 
ACTAAAATTG 
CTTTCTGCiG 
TGGTACCTCT 
GGCTATATCA 
ATTCGTCTTG 
GACACACCCQ 
ATCTTCTCTG 
ATCAACGGGC 
GAQCAAGTTG 
TATCTTATTQ 
CTOGCAGTTT 
GCSCTCTCOOS 
6CCAGGACCC 
CTTGCCATTT 
GAAGGCAATA 
GGAA1GAATC 
CTTAGAACCC 
AAA7UVSTTTA 
TATGAGGAGA 
CAGGATGGAG 
CTTACZGGAG 
QAAOaCACCSV 
AAC!CAGAGGA 

TXTCCAccxaa 

CrCGAGAAAA 
GAAAGAOSBA 

AAAAAcrrre 

AAGCAGAAG6 
ATGGGAGATG 
AGTGGACAAG 
TATCXGCauGA 
CTAOGATACX 
TTCAATGGCT 
CTGTACACCA 
CTGAQCCTCC 
AAGAGATTCT 
CCTCTTGOAQ 
TCTTTTGCCG 
GArACTTCTT 
TM^GGCATCA 
CAATTTACAO 
CrGACTGTTG 



41 
I 

ATTCTCAGCC 
ATGACOUSOG 
AGCGGGAjGCC 
ACGAACAACC 
ATAATGCAAT 
AGCAGTTTAA 
CTCAAATCTC 
TCACTGCAAT 
ACAATAGGAC 
TTCAAGTTQQ 
TGCTGTCTAG 
AAACCAATTT 
ATACATTX3GC 

agtttacagq 

TGTTACGTGG 
GTGCTGACAC 
ATTACTTGAT 
GTCTTOCCAT 
ATGATGGAGA 
TTGTTCTCAA 
GACAGAGTCA 
CAAAAGCTAG 
ATAAGACGGG 
ASATATATGG 
ATTTTAGCTG 
AGCAAATCCA 
GCCACACACr 
ATQAAG9TGC 
AGAACAOCAT 
TGGACTTCAA 
TCAnSCTTTA 
CTACTAAOCA 
TATGGCTTIG 
TGGCTGCCauS 
TTCRAAAAGA 
TTCCAGAAAC 
ACAAAAAGGA 
CCATCIGCTA 
ATAGAGGTGG 
GTGG/LAACCG 
AGACCAAOA6 
TGOGGAGCCA 
TOQACCTGOC 
OCATGSTGOT 
GGGCGAAO^ 
AAGQAKTaCA 
GGCTACTGCT 
TCTTTTACAA 
ACTCTGOGCA 
GCXITGCCCGT 
6ATTCCCTG6 
TTSXAAGCIT 
COTATCTGCA 
TCAOCATTGC 
ATTQGACTTT 
TGTTTGACTT 
GCACAGCTTC 
CT6TGTGCTT 



SI 

1 

TAATGAC6AA 
OTOTGCTCTT 
ATTCAGAAAA 
TCA£3TTATG 
TAAAACATAC 
GAGAGC3U3CC 
TACCCTGGCr 
CAAAGACCTG 
GTGTGAAGTC 
AOACGTCATT 
CTCTGAGCCT 
AAAATTTAAG 
TACATTTGAT 
AACRCTATTT 
CTGIOTAATT 
TAAAATAATG 
6AACTACATG 
OGGOCA1GCT 
A0ACGATACA 
CACXM^IGGTA 
CnCATCAAC 
AACCAOCS^ 
GACACTOVCA 
GGAOCATOGG 
GAATACATAT 
GTCAGGGAAA 
CATGGTQCnT 
OCTGGIAAAC 
CACCATCnOT 
CAGTGAC03G 
CTQTAAAGGT 
AGAAACACRG 
CCAiCAAGGAA 
TGTGGCJCrOC 
CTTAAWCrc 
CATTTCAAAA 
AACTGCKSAA 
TGOGaAaGAT 
CGTCTACSSCA 
TQCCTTAATC 
AAATAAGATT 



CTGGGAGXGC 
GGACCTGGTG 
CGTGAAO^TG 
AGCTGTCATG 
GGTGCaXGGC 
AAACTTTGCC 
GACTGCATAC 
GCTCCTCATG 
GTTATACATA 
GT7GCATGG6 
AAOCGTAGGG 
CTCTGCTCTT 

TGTGAAtracr 

TCATAGTGCI 
AAACGCICTQ 
ACTACCGGTC 



660 

720 

780 

840 

900 

960 

1D20 

1080 

1140 

1200 

1260 

1320 

1380 

144D 

1500 

1560 

1620 

163d 



60 
L20 
LBO 
240 

aoo 

360 
420 
480 

540 
600 
660 
720 
780 

aao 

900 

960 

X020 

1080 

1140 

1200 

1260 

1320 

1360 

1440 

1500 

1560 

1620 

1660 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2260 

2340 

2400 

2469 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480



GTTGCCATTC GATTCCTGTC AATGACXATC TGCaCCATCAiG AAAOTGA.TAA GATCCAGAAG 3540 

CftTCXSCAAGC GGTTGAAGGC OGAGGAGCAS TGGC34GCGAC GGCAGCAG6T GTTCCGC3C(5G 3600

GGCGTGTCAA CGOSQCXSCTC GGCCIACGCC TTCTCQCftCC AGCGQGQCTA CXJOGGAOCTC 3660 

ATCTCCncCG GGGGCAGCA-r CGOCAAGAAS CGCTC5G0OGC GGTGGCGGAT 3720 

GGCACOGOSG AGTACRGGCG CACCI3GGGAC AGCTGA 3756

Seq ID NO: C48 DNA Sequence
Nucleic Acid Accession #: XM_044533
Coding sequence: 238..2721

1 11 21 31 41 51

| | | | | |

GCTCTGCCCA AGCCGAGGCT GOBGOGCOQG GGCOGGCQGG AGOACTGCGCS TGOOCOGOGG 60 

AGSGGCTGAG TTT0CCA<3GG GCCACTTGAC CCTGTTTCCC ACCTCCOQCC CCOCMSOTCC 120 

1j QGAGGCGQGG GCCCCCXSGGG OQACTOGGGG GCOGACCQCQ QGGCC3GW3CT GCCGCCCGTG 180 

AGTCOGGCOG AGCCACCTGA GCCCGAfiCCQ CGGGACACCG TOSCTCCTGC TCTCCGAATG 240 

CTCCGCACCtS CGaTGGGOCr GAGGAGCTGG CTCGOCGCCC CATQOQGCGC GCTGCCGC3CT 300 

OGQCCACOGC TGCTSCTGCT CCTGCTGCIG CTGC7CCT0C TGC3U3C!C(3CC QCCTCCSGACC 360 

TOStSCSGCTCA 6CCG00GGAT CAGCCTGCCT CTGGGCTCTG AAISUSOSGCC ATTCXI^CAGA 420

TTCGAIVCSCTG AACACATCTC CAACTACACA GCCCTTCTGC XGAGCAGGGA TQGCA06ACC 480

CxaXACGTOG GTGCTCGAGA GGCCCTCTTT GCACTCAOTA GCRACCTCRG CTTCX:TGCCA 540

GGCGGGGftGr ACCAGGAGCT GCTTTGOaOT GCAGAOGCAG AfiAAGAAAOV GCAGTGCAGC 600

TTCAAGC3GCA AGGACCCACA QOGCGACTGT CAAAACIACA TCAAGATC5CT GCTGCCGCTC 660

AGCCGCASTC ACCTGTTCAC CTGTGGCAC3L GCAGCCTTCA GCCCCATSTG TACCTACATC 720

A2U2ATGaAGA ACTTCACCCT OGCAAGGGAC GAGAACGGGA ATOTCCTOCT GGAAGArr^C 780

AAGGGCCGTT QTOCCTTOGA CCCGAATTTC AAGTCCACTG CCCTGGIGGT TGATOGOGAG 840

CTCrACACTG GAACAGTCAG CAGCTTC3CAA GGGAA-XGACC CQGCXaLTCrC GCGGftGCCAA 900 

AGCCTTG60C CCAOGAAGAC CSSAGAGCTOC CTCAACTGGC TGCAAGJUXXT A6CTTTTQTG 960 

OA SCCTC RSCCT ACATTCXrrCA GnSCCTGGGC AGCTTGCAAG GGSATOATGA CAAGATCXAC 102 0 

OV TTTTTCTTCA GC3GAGACTGG OCAGGAATTT GAGTTCTTTG AGAACACCAT TGIGTCCOGC 1080 

ATTGCCCQCA TCTGCAAGQG CGATQAGGGT GGAGAGCGGG TGCTACAGCR GOGCTGGACC 1140 

TCCTTCCTCA AGGCCX3U3CT GCTGTGCTCA C^GGCXZOmOG ATGGCTTCOC CTTCAAGQIQ 1200 

ClGC ftBOTJ G TCTTC&QGCT GAGCCCCAGC CCCCAGGACT aaCSGTIQAiCAC CCITTrCTAT 1260 

6GGGTCTTC2L CTTCCCAaTG GCACAOQGSA ACTAC3U3AAG GCTCTGCCXsT CTGIGTCTTC 1320 

J J ACAATOAAGG ATGTGCRfiAG ASTCTTCAGC GGCCTCTACA AGGAGGTGAA COGTGAiGACA 1380 

CAGCAGTGGT ACACQQTGAC CCACCOOC?lG CCCACACCOC GGCCTQCSAGC GTQCATCACC 1440 

AACRfiTGCCC GGGAAAGGAAi GATCAACTCA TCCCTGCftGC TCCCAGACOG OGTGCl'GAAC 1500 

TTGCTCAAGG AGCACTTGCT GATGfiAlCGGQ C2MaaTCCGAA GOGGCATGCX GCrOCTGCTG 1560 

OCXSCAOGCTC GOTGGCTGTA CACGGCGTCX: CTGGCCTGCA CCACACCTAC 1620

GATGTCCTCT TCXJOGGCAC TGGTGACGGC G3GCTCX3VCA AGGCAGTGAG CGTGGGCCOC 1680

CGGGTGCRCA TCAT1X3AGGA GCTGCAGATC TTCTCRTCGa GACAGCCXBT GCAGAATdG 1740 

CTCCTGGACA CCCAC^GGGS GCXGCIQXAT GCGGCCICAC ACTCGGGCOT AGTOCAGGTG 1800

OO MgGO CA ACTGCawSCCT GXACRGGftGC TCTGOSQACT GDCTCCTOSC CCBSGACCCC 1860

. « TACTGTGCrr OSAGOGGCTC CJUSCTGCAAQ CAOGTGM3GG TCEACCSlGGC TCSVGCIGQCC 1920 

ACCAGGOCGT GGATCX!AOGA CAT06AGGGA QCX31GCGCCA AGGACCTTTG CMOGOGTCT 19B0 

TOQGTTSTGT CCCOGTCTTT TGTACCAACA QOGGAGAAGC CATGTGAfXA AGTCCftGlTC 2 040 

CAGCCCAACA C2ySlGAACAC TTTGGC5CTGC COQCTOCXCT CCAACCOJQGC GACCXS3ACTC 2100 

TOGCXAX3GCA ACGGGGGCCC 03TCAA.TGC3C TGGGCCiaCX GCCACGTGCI ACCCACXOSG 2160 

_ GACCTGCTOC TGGTGGaCAC GCAACAGCTG OOGGA0TTOC ASTGCIGGIC ACTAGAOGAG 2220 

GGCTTCCAfiC AQCXGGTAGC CAGCTAC7GC CCAGAQGTGG TGGAGGAOGG GGTGGOUSAC 2280

CAAACAGATG AGQGTOGCAG TGTACCGGTC ATTATCAQCA CATCGCGTGT GAGTGCACCA 2340

GCTGGTGGCA Af3GOCAGCIG GGQIGCAGAC AOGTOCTACI GGAAGGAGTT CCXGGTGAIG 2400 

TGCAGGCTCT TICtGCTGGC 05TGCIGGIC CXA6TTTTAT TCTTGCTCIA CGGGCACCGG 2460 

AACM3CATGA AACaTCTTGCT GAAGCAGGGG GAATGTCGCA aCSt&QkOIX CAAGACCTGC 2520 

JJ CCIGTGQT6C TGCGCXXTTQA GACCGGCCCA CICAAOGGGC TAGG60GCGC TAGGAOCCCG 2580 

CTOGATCACC QaOGOTAOCA GTCCCTOTCA GACAiSGCCCC OGGQGTCCCO AGTCTTCACT 2640 

GagTC ftGftGft. AGAOGOCnCT CAGOVTOCAA GACAGCTTOG TGGAGGTATC CCCAGTGTGC 2700 

CCCXSGCXaCC GGGTCOGCCT TGQCTOGGAG ATCOGIGACr CIGTOGTBTO AlGAGCICACI 2760 

TOC^aM3GAC GCTGCCCTSS CTTCAGGGGC TOTGAATGCT CGGAGAGGGT CAACTGQACC 2820 

TOOOCrCOSC TCTGCTCTTC GXGGAACACS ACX3GTGGTGC OOGGCCCTTG GGAGCCTTGG 2880 

GGCCAGCTQG CCTGCTGCTC TCCAGTCAAG TAGCGAAGCT CCTACCAOCC AGAC&CCCAA 2940 

ACAGOCGTGG CCCCAGAQGT OCTGGOCAAA rTATOGGGGCC TGOCTAGGTT GGTGGAACAG 3000 

TGCTOCTTAT GTAAACTGAG CJCCTTTOTTT AAAAAAGIVA.T TCCAAATGTG AAACIASAAr 30^0 

GAG^GGAAG AGATAGCATO GCATGCAOCA CACAGOGCTO CTGCAGTTCA» TGGCXITCCCA 3120 

DD GGGGTGCTGG GOATGCATCC AAAGTGGTTa TCTGAGAO^ AGTTGGAAAC OCTGACC3AC 31B0 

TOacCTCTTC ACCTTCCACA TTATCCCX3Cr GCCACCGGCT GCOCTGTCTC ACTGC3GATT 3240 

CAflOACCajSC TTGGGCTGC6 TGCXSTTCTSC CTTGOCftGTC AGCOGAGGAT GXaGTIGTTG 3300 

CTGCCGTOGT CCCACCAGCT CAGGGACCAG A0GGCTA06T TGGCACXGCS OCX3CTCACCA 3360 

«vv QGTCCTOGGC TCGGAOCCAA CTCCTGGACC TTTCGftfiOCT OTATCAGGCT GT0GGC31C&C 3420 

G3«»SGAiCAG CQCGAGCTCA GGAGAGATTT CBTGACAATG TACQOCTTTC CCTCAGAATT 3480 

CRGOGAAGAG ACTGTCGCCT GCCTTCCTCC GTTGTTGCQT GAGAAOOCGT GTGCCCCTTC 3540 

CCAOCAJATC CACCCTCGCT CCATCCITGA ACTCAAACAC GAdOGAACTAA CTGC3VCGCTG 3600 

GTCxrrcxcxx: cagtcoccaq ttcacocigc atccctcacc ttcx:tcgaict cxaaggoata 3660 

„^ TCA2« auaGC DCAO CACAGS GGOCXmaAAT TTATSTGGTT TTTATACATT TTTTAATAAG 3720 

/3 ATGCACITTA TGTWMTTTT TAATAAAGTC ntSAAQAATTA CTOTTT 3766 



Seq ID NO: C49 DNA Sequence

Nucleic Acid Accession #: NM_007019.1

Coding sequence: 41..580

1 11 21 31 41 51

| | | | | |

QGCAOSAQOG ASTTOCTGTC TCTCTOCCAA OSGCGCOCSGG ATGGCTTCCC AAAAC0GC3GA 60 


CCCAGCCX3CC ACTAGCGTCG CCQCCaCXlCG TAAAGGAGCT GftGCXXSAGCG C3GGGC3GOC5GC 120 

CCGGGGTCCG GTGCSGCAAAA GGCTACAGCA QOAQCTGATG ACCCXCATfiA TGTCTGGCSA 180

TAAAGGOATT TCTGCCTTCC CTQAATCACxA. CAAOCTTTTC AAATGGGTAG GGACCATCCA 240

TGGAOCA6CT GGAACAGTAT AT6AAGAOCT GROGTATAAG CTCTCQCTAG AGTtCCCCAG 300

TGGCTACX:CT TACAA.TGCGC CCACAGTQAA GTTCCTCACG CCCTGCTATC ACCCCflACGT 360 

GGAC3y:CX2flG GGTAACATAT GCCTGGACAT CCTGAAGGAA AAQTOGTCTQ CXXn:<JTA1t5A 420 

TGTCAGGACC ATTCTGCTCT CCATCCAGAG CCTTCTAGGA GAACCCAACA TTGATAGTCC 480 

CTTGAACACA CATGCTGCCG AQCTCTG8AA AAACCCXZACA GCTTTTAAGA AGTACCTGCA 540 

AGAAACCTAC TCAAAGCAQG TC3UXiWGCC3l GGftGCCCXGA CCCAGGCTGC! CCkGCCTXnC 600 

CTTGTGTOGT CTTTTXAATT TTTCCTTAGA TGGTCTGTCC TTTTTOTGAT TTCTGTATAG 660 

GACTCTTTAT CTT@AGCT6T G6TATTTTTG TTTTGTTTTT GTCTTTTAAA TTAAGCCTGG 720 

GTTGAGCCCr TGTATATTAA ATAAATGCAT TTTTOTCXTTT TTtTAAAAAA AAAAAAAAAA 780 

AAA 783

Seq ID NO: C50 DNA Sequence

Nucleic Acid Accession #: NM_014584.1

Coding sequence: 227..1633

1 11 21 31 41 51

| | | | | |

OCACQAGOCC OOGGCTGCCG GCGCOG6CGC 08GC3GCACX3T CCSlCAGGCrG GGTCGGBAGQ 60 

TGGCGATCGC TGAGAGGCAG GAGOGCCGAG GCQC5GCJCTGG QAGGCGQGCC GGAGGTGGGG 120 

CX3CC3GCTGGG GCOGGCCCGC ACQQGCTTCA TCTGAGGGOG CAC06CCCX3C GU^CCGAGGGT 180 

GOGGACTGGC CTCCCAAGCG TGGGGGGACA AGCTGCCGQA GCTGCAATGG GOOGOGGCTG 240 

GGQATTCTTG TTTGGCCTCC TGGOGQCZCBT GTC5GCTGCTC AGCTOGGGCC ACGGftGAGGA 300 

GCaSOCCGCG GAOACASCGG CACAGAiOGTG CTIClGCCAG GTTAaTGQT? ACITOGAT6A 360 

TTOTAOCrGT GATGTTGAAA CCATTGATAS ATTTA&TAAC TACftGSCmT TCOCAAGACT 420 

AGAAAAACTT CTTGAAAGTG ACTACTTTAG GTATTACAAG ^AMiCCXOA. A6AGGCCGTG 480 

TOCTTTCTGG AATGACATCA GCCAGrGTGG ARQAAQGGAC TOTGCTGTCA AACCRTGTCA 540 

ATCTGATGAA GTTCCTQATa OAAITAAATC TGOGAGCTAC AftfiTATTCTG AAGAMCCRA 600 

TAATCTCATT GAAGAATGTG AACAAGCTGA 2\OGACETOSA GCAGTGGATG AATCTCTGAG 660 

TGAGGAAACA CAGAAGQCTO TTCTTCAGTC GACCAAGCAT GATSATTCTT CA6ATAACTT 720 

CTGTJ3AAGCT GATGACATTC AGTCCCCTOA AGCTOAATAT GTAGATTTGC TTCTTAATCC 780 

^ _ TGAGOSCTAC ACTGGTTACA AGGGACCAGA TGCTTGGAAA ATATGOAATQ TCATCTAOGA 840 

Dt> AGAAAACTGT TTTAAGCCAC AfiRCAATTAA AAGACCTTTA AAT0CTTTG6 CTICTGGTCA 900 

AGG6ACAAGT QAASAQAACA CTTTTTACAG TTGGCTAGAA GGTCTCTGTB TAGAAAAAA6 960 

AGCATTCTAC AGACTTATAT CTGGCCTACA TGCTAGCATT AATGTGCATT TGitfSXtsCAAG 1020 

ATATCTTTIR CftAOAGACCTr GGTTAGAAAA GAAAIGGGGA CAC3UVCATTA CAGAATTTCA 1080 

ACAaDGATTT 6ATGGAATTT TQACraAAGQ AGAAGGTCCA AGAAGGCTTA AGAACTTOTA 1140 

4U TTTTCrClAC TTAATAGAAC TAAGGGCITT ATCCftA&GTQ TTACCATTCT T06AG0GC0C 1200 

AQATTTTCAA CTCTTTACIX3 GAAATAAAAT TCAGGATGftG GAAAAGAAAA TGTTACnCT 1260 

GGAAATACTT CATGAAATCA AGTCATTTOC! TnQcaTTTT OWTQAGAATT CATTTTTTGC 1320 

TGGGGATAAA AAAGAAGCAC ACAAACTAAA GGAGGACTTT GGACTGCRTT TlWaftAATAT 1380

TTCAflSAATr ATGGATTGTG TTGGTTGTTT TAAATOTOGT CTGTGQGGAA AfiCTTCSieAC 1440

TCAGGGTTTG GGCACT3CTC TGAAGATCTT ATXTl'CxaAa AAATTGATAG CAAATATGCC 1500

AGAAnOTGGA CCTAGTTATG AATTCCATCT AACCAGACMV GAAAXIbSZAT C3VITATTCAA 1560 

CX3CATTTGGA AGAATTTCTA CAAGTGIGAA AGa^ATTAGAA AACTTCAGGA ACTTGTTACA 1620 

GAATATTCAT TAAAGAAAACI AaGCIOATAT GTGCCTGTTT CTGGACAATG GAGGCGAAAG 1680 

Cn A gTGGA ATTT CATTCAAAGG CRTAATAGCA ATGACAGTCT TAAGQCAAAC ATTTTATATA 1740 

AAGnQCTTT TGTAAAGGAG AATTATATTG TTTTAAGTAA ACACATTTTT AAAAATTGrO 1800 

TTAAGTCIAT GTATAATACT ACTGTGAGTA AAAGTAATAC TTTAATA&TG TGGTAiCAAAT 1860 

TTXAAAGTTT AATATTGAAT AAAAGGAGGA TTATCAAATT CAIATATGAT AAAACTC^AAT 1920 

GTTCTAAGTC TCTCAAACTA GC56TTTTATG TAATAATATG TAATATAAAT AAAACTATGG 1980 

TAAAXGTGAC AAGCATTTAA TAGGAAAATG CTAAGQnGCC CTCATAARTG ACCCATAATT 2040 

ACCABOGTAG AATTTTTCftG TACATTTAGG GTTBCIGGAT TTAQCaUiATA AAAATAAAGA 2100 

TTGGCCAGIT AOATTTGAAT TTCAGATAAA CAATTAQTTT TTTAATATTT TACftSGGalkT 2160 

ATTTGGAAAA T&CTTATACT AAAAAATTAT TIGTTTGAAA TTObCArTTTA ACTGGG3U3TC 2220 

TTSrCArCTTTA TCTGGCRATC CTAAAATACA TTGGTATGRA ACAAATCACT TiTAfiAAaTA 22 BO 

TRTTGCTATT TTGH^TGaOT TGTTTTTGTG TGTAGAAACS ITiJCRATAACA ACTCAAAGGC 2340 

AGAGGAGATT TCTAAACATT GTGAAAAGIT GAATAGATTA TATATTTATT CTCATAAIFAC 2400 

TTTCACTAAT ACTAAATAAA ATTTGQOGAA CACTTTTTAT ^rtTXATATAA TTTCXZAATTT 2460 

A CAGA AAftST TTCAAAAATA GTACAAAGAG CTCTCTTACC CAGATTCnCT AATTSTTCMC 2520 

ACQTGCTTTA TCTTTCATGC TTTCrCrGTA CACACACACA CACACAC3AA TTTTT<2CTCA 2580 

^_ ATCATTTGAA AGltSUaTTAT A6GCATCATG C500CTTAAAC CCTAAATACT TCAGTGTGTA 2640 

Ot> ATACTGAATA ATTACTAAAA ATGArTTTTCT CRGAAAAAAA AACTCCCACA ATTCTGGAAC 2700 

TATAATACOXS TARGCCTTAG A&TAAATAAT ACrrTC3U«3r TCCAA3CTAA AGTTCTTTTT 2760 

GAQTTTTOTT GCOCGTTTTa TGCTT6ATQT GTATAGTAAT AGG&TftGGCT ATTTATTTTA 2820 

TTAAAATTTT TTTTAlGAGAC AAGGTTTTGC TI5K5TTQCCC AAQCTGGAAC T7GAACGACT 2BflO 

OGSOCTQAAGT GATCITCCCA GCTCSWSCCTC OC3UVSTAGCT GGGAATACAG GTQTCTGCCA 2940 

CCATACCCAG TTTCRXTTTT GTTTTTTATA OOOGAAGTTC AITTGCTTTG rCCTCCCXAAA 3000 

ACTGAACTGT AATTTTGGGA GGTTTTCATT AGIGGAAGCT CTTCATVTAT AAAGCTATTT 3060 

GAAGGGGTTT AGGAATT17AT ATCACATQGT AATTGTAGAG AAAAAGAAGC lATATACiCXC 3120 

AAAATOGTTGC CCTCTTTACA TATGTCTTAT CAGGTATAAC ATQTTGi^AAT OTCACRTTAG 3180 

T&6IAAAGTG GGGTTXATTT ATATAQTOGT TAAGAAATGT CAGTTTACAC TGCXGTATAC 3240 

TT CTTCT TCT GTGTCOCTAA GGCCTGGIAC ASlGCCSUUaC ACATACTTGG TATCCAATAA 3300 

ATA'TETOTTG GATGAAAAAA AAAAAAAAAA AAAA 3334 



Seq ID NO: C51 DNA Sequence
Nucleic Acid Accession #: NM_002888.1
Coding sequence: 37..723

1 11 21 31 41 51

| | | | | |

CCAOGTCCGG GGTQCCGAGC CAACTTTCCT GCGTCCATQC AGCCCCGCCG GCRRCGGCTG 60

CCOQCTCXZCT GGTCOGGGCC CAGGGGCCCG CBCCCCACOG CCCXX5CTGCT CGCGCTGCTQ 120

CTGTTGCTCG CCCaSQTQGC QGCGCCCQCQ G6QTCCGGGO GCCCXSaacOA CX:CTQC3GC3VG 180

CCTCAOQATO CXGGGGTCCC GCGCftGGCTC CTGCAGCAGA AGGCGCGCGC GGCGCTTCAC 240

TTCTTCAACr TCCGGTCCGG CTOOCXZCRiSC GCOCTaCGAjG TGCTGGCCGA C3GTGCAGGAG 300

GGCOOCaCSGT GGATTBATCC AAAAGAGGGA TGTAAAGTTC RCGTC5GTCTT CAGCAX^AGAG 360 

CGCTAC&AjCC CRGRQTCTTT ACTTCMSGAA I3GTGAGGGAC GTTTGGGGAA ATGTTCTGCT 420 

□BAGrGTTTT TCAAfiAATCA QAAACCCA0A CCAACCATCA ATGTAACTTG TACACGGCTC 480 

ATCGAOAAAA AGAAAftGACA ACRAGAGGAT TACCTGCTTT ACAAGCAAAT GAAfiCAACTQ 540 

AAAAACOCCT TC5SAAATAGT CAGCATACCT GATAATCATG GACATATTGA TCCKTCTCTG 600 

AGACTCATCT OGGATXrGGC TTTCCTTGGA AGCTCTTAOB TQATGTGGGA AATC3ACAACA SSO 

CAGGTGTCAC ACTACTACTT GGCACAGCTC ACTA<3Te!i;GA G6CA6TGGGT AAGAAAAACC 720 

TGAAAATTAA CTTGTGCCAC AAGftSTTACSV ATCAAAGTOS TCTOCTTAaA CTQAATTCAT 780 

GTGAACrrCr AATTTCATAT CAAGAGTTGT AATCACATTT ATrrCAATAA ATATGTGAGT 840

TGCTGC 846

Seq ID NO: C52 DNA Sequence

Nucleic Acid Accession #: NM_005409.3

Coding sequence: 94..378

1 11 21 31 41 51

| | | | | |

TTOCTTTCAT GTTCACaCATT TCTACTOCTT CX:AAGAAGAG CAGCAAAGCT GAAGTA6CAG 60 

CAACRGCACC AGCRGCAACA GCRAAAAACA AACATGAGT6 TGAAG6GCAT CiGCTATAQCC 120 

TTGGCTGTGA TATTGTGTGC TACAQTTGTT CAAOGCTTCC CXaXGTTCAA AAGAGGAOGC 180 

TGTCTTTQCA TAGQCCCTGQ GGTAAAAGCA GT6AAASTGQ CAGMATXGA GAAAOCCTGC 240 

ATAATGTACC CAAGTAACAA CTGTGACAAA AXABAAGTOA TTATTACOCT GAAAGAAAAT 300 

AAftGGACA?iC GATGOCTAAA TCCCAAATOG AftGCAAGCRA GGCTTATflAT CSUWiAAAtSTT 360

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTG CTGQAAAAjQG GCATCTGAAA 420

AACCTAGAAC AAQTTTAACT OTGACTACTG AAATGACAAG AATTCTACAG TAGSAAACTS 480

AQACTTTTCT ATGGTTTTGr GACTTTCAAC TTTVeXACAG TTATCTOAAG OATGiAAACaGT 540 

GGGTGAAAGS ACCAAAAACA GAAATACAST CTTCCTGAAT GAATGAC3AT CAGAATTCCA 600 

CTGO CCAAAG GAGTCCAGCA ATTAAATGGA TTTCXAOGAA AAaCTA<3CTT AAGAAAGGCT 660 

GGTTACCATC QOaGTrTACA AAGTGCTTTC AOGTTCTTAC TTGTTGTATT ATACATTCAT 720 

OCATnCTCTAG GCTAGAGAAC CTTCTAfiATT TGATGCTTAC AACTATTCTG TTCTGACTAT 780 

GAGAAC!ATTT CTGTCTCTAG AAGTTATCTG TCTGTATTGA TCTTTATGCT AXATTACIAT 840 

CTGTGGTTAC AGTOSAOACA TTGACATTAT TACTGOAGTC AAlGCCCTTAT AA6TCAAAAG SOO 

CATCTATGia TCOTAAAGCA TTCCTCRAAC ATTTTTTCAT GCAAATACAC ACTTCTTTCC 960 

CCAAATATCA TGTAGCACAT CAATATGTAG GQAAACATTC TTATGCaTCA TTTGGTTTGT 1020 

TTTATAACCa ATTCATTAAA TGTAATTCAT AAAATGTACT A3GAAAAAAA TTATAOaCTA 1080

TGGGATACT6 GCAACA(3TGO ACATATTTCA TAACCAAASCT AGCA6CAC06 GTCTTMOTT 1140

GATGTTTTTC AACTTTTATT CATTGAGATG TTTXGAM3CA AT1»QGATAT GTOTOTTTAC 1200 

TGTACTTTTT GTTTTfiATCC OTTTGTATAA ATGATAGCAA TATCTTGGAC ACATTTGAAA 1260 

TACAAA ATGT T TTTG TCTAC CAAAGAAAAA TGlrTGAAAAA TAAGCAAATX3 TATACCTAGC 1320 

AATCACTTTT ACaTTTTTOTA ATTCTGTCTC TTAGAAAAAT AC3«AATCTA ATCaATTTCT 1330 

TTGTTCAT6C CTATATACXG TAAAATTTAG GXATACTOUL GACTAQTTTA AAGAATCAAA 1440 

QTCATTITTT TCTCTAATAA ACTAGCACAA CCTTTCTTTT TTAAAAAAAA AAA 1493 

Seq ID NO: C53 DNA Sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..609

1 11 21 31 41 51

| | | | | |

ATGCTGC3GGC AG6T6CTXCG aUSAOGaCTC CAGTCSSTTCT GCCACAGGCT QG6TTTGTGC 60 

GTGAGCOQGC ACCOGGTCXT TTTCCTCAOC GIGOCXSGCM TOCXGACAAT CAGCTTCSQQC 120 

CTCAGCGOGC TCAACCGCTT CCAfSOCOGAG GGOGACCTGG AGOGOCTG6T OSCTCCCAGC 180 

CACAGCC3Xaa CCaiAGATOGA GOGCAGCCTG GCCAJGC3W3CC TTTTCCCCCr GGAOaVOTCC 240 

A2UUUaCCnGC TCTArrrOGGA CTTACACACC C3CTGGGAGGT ATOGCAdSGGT GATCCTCCIC 300 

TCCXZCAACOa GGQACAATAT TTTGCTOC3U3 GCTGAGOQGA TGCtGCASAC GCACCOAGCC 360 

GTOCTGGAAA TGAftGGTGAA CCACAAGGOC TATAATTATA CTTTTTCCCA TCTOTOTGIG 420 

TTGA^AAATC AQGATAAOAA ATGOGTGCTG GATGATATTA TTTCAGVGCT AGAGeATCIC 480 

AGGCaOSCTG OOTTCTOCAA XAAGACAAC3V GCXZAaaOTOC AASTQAOOXA TOCCAACACT 540

AAATTAAAGG lATGCTOCTT CTQCKSGCCT CTGGCAATTA AAGftSGCAGC ACITCATTTC 600

TTGCCCTAA 609



Seq ID NO: C54 DNA Sequence

Nucleic Acid Accession #: NM_002438.1

Coding sequence: 104..4474

1 11 21 31 41 51

| | | | | |

GGGAACTTG6 ATTAG0TG<3A QAGGCS^iGTrG GGGGGCCTOG I'mT T ri GCG TCTTAGTTCC 6D 

GCOCICCTIQT OCATCAGGAG AAGGAAAGGA O^AAAOCCrGG GCCATGAGGC TAOCX:CrGCT 120 

CCTGGTTTTT GCCTCTGTCA TTCC3GGGTGC TGTTCTCCrA CTGGACAOCA QGCAATTTTT 180 

AATCTATAAT 6RAGATCACA AGCX3CTGC3Gr GGATGCAflTQ ASTCXXAGTQ COQTGCAAAC 240 

OGGAGCTTGC AACCAGGATG CXaSAATCACA GAAATTCOGA TGGGTGTCCG AATCT<3U3AT 300 

TATQAGTOTT GCATTTAAAT lATGCClGGG AGTGCX»rCA AAAACAGACT aOQITGCTAT 360 

C ACTCTC TAa: GCCIGTGACT CAAAAAGTGA ATTTCASUiA TOGGAGTGCA AAAATGACAC 420 

RCrTTTGaaO ATCAAAGGAG A?VGATTTATr TTTTAACTAC OaCAACAOAC AAiC5AA7kAQAA 480 

TATTATGCTC tCAiCAAGGGAT CX3GGTTTATQ GAGCAGGTGG AAGATCTATG GAACCACftGA 540 

CAATCTQTBC TCCAGAOGTT ATGAJ^CCAT GltAXACaCTA CTAOGCAATG CCSVATGOAGC 600 

AAOCTGTGCA TTCCCGTTCA AGrTTGAAAA CAAGTGGTAC GCAGATTGGA OGAGTGCTGS 660 

G06GIGGGAT GGATGGCTCT OGTGGGGAAC CACTACTOAC TATGACACAG ACAAGCTATT 720

TGQATATT6T CCATXGAAftT TTGIA6GGCAG TGAAAGCTTA TGOAATAAnS ACCCGCTOAC 780

a«3CQTTTOC TACCRGATAA ACTCCRAATC OGCTTTAAOS TOGC^CCAAG CGA0GAAAAG 840 

CTOCCAACAA O^GAACGCTG AGCTCCTGayS OVTCACAGAG ATACATQAGC AAACftTACCT 900 

GACRGGATTA ACCAGTTCCT TGACCTCAGG ACTCTGGATT GGACTTAACA GTCT6AGCTT 960 

CRACRGCSGOT TGGCftGTOGA GTGACGGCftG TCCTTTCCGA TATTTOAACI GGTTACCAOG 1020 

AAGTCCATGA GCIGAACCTQ GAAAAAGCTG TGrGTCMTTA AATCCTGGAA AAAAIGCTAA ID 80 

ATGGGAAAAT CTGBAATGT6 TTCAGAAACT GOOCTATATT TGCAAA&AGS <3CAACACCAC 1140 

TTTAAATTCT TTTGTTATTC GCTCAGAAAS rGATGTGCCT ACTCACTOTC CTAGTCT^GTG 1200 

GTGGCCOTAT GCCGGTCACT GTTACAAGAT TCACAGAGAT GAGJUU^AAAA TCCAQAC3GGA 1250 

TC3CTCTGACC ACCTGCftfiGA AGGA2^QC5CGG TGaCCTOVCA AGTATCXUVCR CCATCGAGGA 1320 

ATXGGACTTT ATTATCTOCC AQCTAGGATA TGAGOCAAAT GACGAATTGT GQATCGQCTT ISaO 

AAATGACATT AAQA7TCAAA TGTACTTTGA GTOCSAGTGAT GGOACCCCTG TAACGTTTAC 1440 

CAAATGGCTT CGT06AQAAC CAAGCXATGA AAftCAACAGA CAGGAGGATT QTaTGOTGAT 1500 

OaAAQGCAAG GATGGGTACT GGGCfiGATCG GGQCTOTQAQ TGGCCTCTTG GCTACATCTG 1560 

CAa^QATGAAA TChCGAAGCX: AAGGTCCAGA flATAGTGGAA GTOSAAAAAG QCTGCftGGAA 1620 

AGGCTOGAAA AAACATCftCT TTTACXGCTA TATOATTGOA CATAOGCTTT CTkACRTTTGC 1680 

AGAAGCAAAC GAAACCTGTA ATAAT6AGAA TCCTTArTTA ACAACTATTG AAGAQWSATA 1740 

TGAACAAGCC TTCCTGACTA GmCGTTGG CTTAAGGCCT GARAAATATT TCTGGACAfiG 1800 

ACTTTCAGAT ATACAAACCA A34GGGACTTT TCAGTQGACC ATCGRGGAAQ AGGTTCX3GTT 1860 

CAOCXaurrGQ AATTCAGATA TGCCSVGGGCG AAAGCCAGGG TGTGTTGCXS^ TGAGRACCOQ 1920 

GATTGCAGGG GGClTaTGGG ATGTTTTGAA ATOTGATGAA AAQGCAAAAT TTCSTGTGCAA 1980 

GCACTGGGCA GAAGGAGTAA CCCACCX31CC GAAGCCCACG ACGACTCCCG AACCCAAATQ 2040 

TCaSGAGGAT TGGGGCGCCA GCAGTAGAAC AAGCTTGTGT TTCAAGCTQT ATGCAAAAGG 2100 

AAAACAT6AQ AAGAAAACGT GGTTTGAATC TCQAGATTTT TGTCX3AGCTC tGOerGGASR 2160 

CTTAOCTAGC ATCAATAACA AAGACKAACA GCAAACAATA TGGGGATTAA TAACAGCTAG 2220 

TGGAAGCTAC CACRAACT6T TTTGSTTGGS ATTGACSVPAT 6GAAGCCCTT CAGAAGGTTT 2280 

TACTTOGAGT GATGGITCTC CTQTTTCATA TGAAAACTGG GCTTATGGAG AACCTAATAA 2340 

TTATCAAAAT GTTOAATACr GTGGTGAGCT CsiAftAGGTGAC CCTACTATGT CTTCGAATGA 2400 

TATTARTTQT GAACACCTTA ACAACTOQAT TTGCCAGATA CAAAAAGGAC AAACACCAAA 2460 

ACCTGAGCCA ACAGCAGCTC CTCAAGACAA TCCACCAGTT ACTGAAC9ATG GGTGGGTTAT 2520 

TXACAAAOAC TACCAGrTATT ATTTC3USCZAA AGAGAAGGAA ACCATG6ACA ATGCGCGAQC 2580 

GTTTTGCAA6 A6GAATTTTG GTGATCTTGT TTCTATTCAA AGTGAAAOTG AAAAGAAGTT 2640 

TCTATGGAAA TATGTAAACA GAAATGATGC ACAGTCTGCA TATTTTAtTG QTTTATTGAT 2700 

CAGCTXGGAX AAAAACSTTTG CTTGGATGGA 1GGAAGCAAA GTGGArETACG TGrCTTGGGC 2760 

CACAGOTGAA CCGAATTTTG CAAATGAAGA TGAAAACTGT GTGACCAXGT ATTCAAATTC 2820 

AGGGTXrCGG AATQACKTTA ACTGTGGCTA TCX2AAACGCX: TTCATTTC3CC AGOGACATAA 2880 

CAGTAGTATC AATGCXACCA CASTTATGCC TACCATGCCC TGQGTCCCM CAQGOTGCAA 2940 

GGRflCGTTGS AATTTCTACA GCAACAA6TG TTTCAAAATC TTTGGATTTA TGGAAGAAGA 3000 

AAGAAAAAAT TGGCAASAGQ CACC3AAAAGC TTCTATAGGC TTTQGAfiCSGA ATCTGGTCTC 3060 

CATACAAAAT GAAAAAiGlAGC AAGCATrTCT TACCTATCAC ATGAAGGACT CCACTTTCRG 3120 

TQCC3QGACT GGGCTGAATQ ATGTCMVTTC AGZUVCACAGS TTCCTTTGQA CGGATGGACG 3180 

AGGAGTCQLT TACACAAACT GGGGGAAAGG TTACCCTGGT GGAAGAAiSAA GC3\GTCTTTC 3240 

TTATGAAGAT GCtGACTQTG TTGTTATTAT TGGAGGXGCA TCAAATOAAO CAGGAAAATC 3300 

GATGGATGAT ACCTGOGACA GTAAACGAdQ CTACATATGC CAGACAOGAT CC3SACCCTTC 3360 

CTTGACIAAT CCTCCAGCAA OGATTCAAAC AGATGGCTTT GTTAAATATG GCAAAAGCAG 3420 

CTATTCACTC AIGAGACIAAA AATTTCAATG GCATGAAGGG GAGACATACT QCAAOCTTCA 3480 

CAATTOOCTT AT2U3CCnQCA TTCTGGATCC CTAC3U5TAAT GCATTTGCGT GGCTGCAGAI 3540 

QGAAACATCT AATGAACGIG TGTOQATCaC CCTGAACAGT AACTTGACTG ATAATCMTA 3600 

CACrTGGACr GATAAOTGGA GGGTGAGGXA CACEAACTGQ GCTGCTGATG AGOCXAAATT 3660 

GAAA^CAGCA TGTGTTTMC TGGATCTTGA TGGCTACTGG AAGACAGCAC ATTGCAATOA 3720 

AAGTTTTTAC TTTCTCTIGTA AAAGATCAGA TGAAATCCCT GCTACTQAAC CXJCCACAACT 3780 

<3CCTGeCAf3A TGCOOGGAGT CAGATCACAC AGCaVTGGATT CCTTTCCATG GTCACTGITA 3840 

CT^KPTGRG TCCTCATATA CAAGAAACTG GGGOCAAfiCrr TCTCTGGAAT GXCTTOGAAT 3900 

GOSTTCCICT CnOGTTTCCA TTGAAflSTQC TGCAGAATCC AGTTTTCTGT CATATOQQQT 3960 

J3 iXSAGCCaCTT AAAAGlAAAA CCAATTTTTG GATAGGAlTS TTCASAAATQ TTGAAGGGAC 4020 

GTOGCTQTGG ATAAATAACA GTCOGGTCIC CTTTGTCAAC TGGTiACACAG GftGATCCCTC 4080 

TOOTGAAOSG AAXGAa^TqTQ TAGCTTXACA TGCX5TCTTCT QGOTTTTCQA GTAATATTCA 4140 

CTGTTCTTCC TACAAAGGAT ATAITTGTAA AW3ACCRAAA ATTATTGATG CTAAAOCTAC 4200 

TCATQAATTA CTTACAACAA AAGCTGACAC AAGGAAGATG GACCCTTCTA AACCGTCTTC 4260 

CAAOGTGGCC GOAGTAGTCA TCAOTGITGAT CCTCCTGATT TTAAOGGGIG CTGGCCTTGC 4320 

OGCC^TTTC TTITAXAAdh AAaSACGTOT GCACCTACCT CAAGAGGGCG CCTTTGAAAA 4380 

CACTCTGTAT TTZAACAGTC A6ICAAGCCC 2^AACTAGT GATATGAAAS ATCTCGTGGG 4440 

CAATATTGAA CAGAATGAAC ACXC30GTCAT CTAGTACCTC AATGOGATTC IGAQAIATTT 4500 

GRATTTCATA AAATTGTAAC TGAAATTTAA AATTTTTAGT TCAATGTOAT TGTTTECTTT 4560 

AAA ATGfifi TA CTGAATIGIA CTOOTCTBTC CTTTTTTCCT TTGCCTAATT GAA6AAATAA 4620 

^CTGCrTGTTT TCIAGCCTGG CAAGATATTT TCaUTAAAAGA GGGATAACAA TGCTGATTAC 4680 

TAGCTTTtAA AATATTTTAG AtTAAATGCAC AGCAGCACAG CACGACATCT AAGCATTAOT 4740 

GATOGGTAGC TGATGTCAGC TTCATGTGGA TTITAAGCJAC TCTAilAAACA ATGAAGCTTC 4800 

TTQGCATATT TTAASGAGCT CCCAAAATQT GTTACCEATT AAATTGTAAC TCAQCAAGTA 4860 

QAAGAOQkTT TQAAAA6TCA GGTACAAAIT TGCTCAAGTG GCATAAAAAT GTAGTCAOTT 4920 

TTCTCTTTTA CdUSTITTTA TTTOCACTOC AATTATTFAG AACTTTATTT GTACATGTGC 4980 

AGAAGAAXAA GOCAGCTGAjS AATCTTGSTT CCXXX:AAQA0 AGTTTTAGAG GCXGAGlGTr 5040 

GCAAATGTGT TCTTIQTCCT GTTATATGTA ^TATCAQGAAT ACAAGGATOT GAAATAAAAC 5100 

TGIAAATTTG CATAACTGGA TGTACTTAOA TAATGTGAAA TAAACATTAA AGACaUUSGXC 5160 

TATTTTTAAT AAAAAAAAAA AAAAA 5185 

Seq ID NO: C55 DNA Sequence
Nucleic Acid Accession #: NM_024574.2
Coding sequence: 424..2130

1 11 21 31 41 51

| | | | | |

AGJGCGACTA GGGGOGGCX^G GCGAAGCGGB GGOCAGCCCC 
OCGQCGGCCC AOCCCOCGCA CGCTQGCTGC AGTTTAAAAG GACCTOCGGC 
C5GCTGCCCCG GGATTC(XCA GCCX3CGCG0G GCTCCCTACT OCJUnTTCGCA GCAACTTCG6 180 

CGACcacaOO CCGCCCXSCEC TOGOCCGOCT TTQRAOTTTG CTGTGCCCmC CGCAAAGTTG 240 

GGACACTTCA GOSGATTGAA TTTTTCTCTT TTATCTGCCT OCtSTCCCCGC CCTCCAGGCT 300 

5TCTC33TTCXT GGftTATTGQT GCTTASCATC TIQGC3GGGT OCGGOCJACGr GGACTATTTC 360 

GCACACCACA CCftCCSGGGAG GGATTTTTTT CTATTTTCCC TACGAAAAAC AGATCTTTTT 420 

AAGGATGGTG CTGCTCCRCT QGTGCCTGtJT GTGGCTCCTG 3TTCGRCTCA QCTCRAOgAC 480 

CCAGAAGTTA CCCaMXKGGG AlOftjaQAACrr TirTCAGATG CAGATCC3GGG ACAAGGCATT 540 

TTTTCATGAT TGGTCAGTAA TTCCAGATGG AGCTGAAATT AGCAGTTATC TCTTTAGAGA 600 

mTACftCCTAAA flGGTATITCT TTGTGQTTGA AGAAGACaAT ACTOCATTAT CAGTCACRGT 660 

QACGCOCTOT GATQCTOCTT TGGAGTGC5AA GCTGRGCCTC CAGGAGCTGC CASAGGACftfi 720 

GAQGGQGGAA GGCTCflGOTQ ATCTGGAACC TCTTGAGCAG CnGAAQCAGC AQATCATTAA 780 

T GAGGA AGGC ACTTQAGXTAT TCTCCTACAA AG13CAATGAT 6TTGAGTATT TTAIATOQTC 840 

TAQTTCKCCA TCCG6TTTAT ATCAGTTGGA TCTTCTTTCA ACAOAGAAAG ACACACATTT 900 

CAAAGTATAT GCCACCACAA CTCCAJaftATC TGATCAGCCA TACOCTGAGT TACCCTATGA 960 

COCAAGAGTA GKSGrrGfiCCT CACTGGGGOG CACCACGGTC ACTTTGOCCT 6GAAACCAAG 1020 

CCCCACTGCC TCTTTGCTGA AACRACXXAT TCAQTACTGT GTGGTCAK3^ AC3UUU3AGCA 1080 

CAATTTCAAA AGTCTCTGTQ CAGTGGAAGC AAAACTGAGT GCAGATQATG CTTTTATGAT 1140 

GGCACC3GAAA CXTGGTCTOG ACTTCAOOCC CTTTGACTTT GCCCACTTTG GATTTCCTTC 1200 

TOATAATTCA GGTAAAGRAC GCAGTTTCCA GGCAAAGCCT TCTCCAAAAC TGGGGCGTCA 1260 

TGTCTACTCC A6GCCCAAI3G TTOATATTCA GAAAATCTGC ATAOGAAACA AGAACATCTT 1320 

CACOGTCTCT GATCTGAAAC CCGACAGGCA GTACTAETTT GAOGTATTTG TGGXCAACAT 1380 

CAACAQCAAC ATGAGCAOCG CTTATOTAGa TACCTTTGCJC AGGACCAAGG AAiSAAGCXM 1440 

ACAGAAGACA GrCGAGCTCA AAGATGGGAA GATAACAQAT GTAITTGTTA AAAGGAAGGG 1500 

AGCAAAGTTT CTAOGGTTTG ClCCAGTCTC TTCTCACCAA AAAGTCACCT TCTTTATTCA 1560 

CTCTTOXCTG GAaCGCTQTCC AAATCX3iAGT GAGAAGAGAT GGOAAACTTC TTCTGTCTCA 1620 

GAATGTGGAA GGCATTCAGC AGXTTCflGCT TAGAGGAAAA CCTAAAGCTA AATACCTCX3T 1680 

TGG&CTGAAA GGAAACAAQA AAGGAGCATC TATGTTGAAA AaTCTAGCPA CCS^CAAGGCC 1740 

TACTAAGCAG TCATTTCCCT CTCTTCCTGA AQACACAAGA ATCAAAGCCT TTGACAAGCT 1800 

CCBTACCTGT TCCTCGGCCA CXHTOGCTTG GCTAGGCACT CAGGAAAGGA AGAAGTTTTG 1860 

CATCTACAAA AAAfSAAGTGG ATGATAACTA CAATOAAGAC CAGAAGAAAA GAGAGCAAAA 1920 

CCAATOTCTA GGACC7WGATA TAAQQAAGAA GTCAGAAAAG GTCCTClGTA AATATTTOCA 1980 

CAGTCAAAAC CTGCAGAAA6 CAGTGACCAC AGAAACAATT AAAGGTCTTC AGOCTGGCAA 2040 

ATCTTACCTG CTGGATGTTT ATGTCATAGG ACATGGQGGG CACTCTGTAA AGTATCAGAG 2100 

TAAGGITGTG AAAACTAGAA AGTTCTGTTA GTrACCTTCT TATAGAGATA TATTATGTAa 2160 

AACTCCAGGA 6GGACAITAA ATCACTTTAA GTATAAAClG ACTACTCXZCA CAGTTGAGAG 2220 

AAGTTGTGAC CTGTACTTGT ACTATGQAAG GAAGGATAXC AACGT6TGTA TATTGATBTT 2280 

TATATAAGTA ACXClTGAAa OAGACTTGrT GTAGCGTGOC GCATGGTACC TAGTGTGTGT 2340 

CIGATGOCGO TTGGTGTCRA AGATAGAGGG CTTCTT6AAG GAACTTGCCA TTCCTlGCTT 2400 

TGAOCACTGC ATGAACTGCT TCXAAATTAT TTTATTACCT AAAAATTTAA AATATGCCAT 2460 

TCATTGCACA CACCCftCAAA TGCAAATCAT TCCTCTCTAT AGATGCTA«36 ATATATATAA 2520 

ATTATTTTAT AAATTCTTGT TTTAAATGTC AGTGTTTCSA IGATTOTAAA CTATTAAATT 2580 

i-x-ri-xCCT AT TAAAGTACAG ATCTAATCTA AGTATTATTA AGTTGAXAGC CCTCTABTCA 2640 

GTTATATTGC /rATTGTAAAT TCTTGTTTGT 7GAGTAAAAT OTTTAAA1AC TATATGTAOC 2700 

TCATGTACAA ABTTGACATA CATTATATTC ATGTACAXAA AATTAAAGAG ATTAGATTAT 2760 

ATACTGITAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 2808 

Seq ID NO: C56 DNA Sequence
Nucleic Acid Accession #: 80034228.1
Coding sequence: 37^..1422

1 11 21 31 41 51

| | | | | |

ATCcesaaoT qstgaosocsa gaggctgggg tctccaggac cAAcrccrcx tcatcttcgt 60 

CTTCCTCftGC CTGCTCAATG TGAAGOOCCT GATCATGATT CACTTCCACT TAATAAATAA 120 

AGTGTTTACA AATCAGAATA ACTTTTAOAC AATATEAAGG TGGTAATCAT GAACAOAAAA 180 

GATTTTGTAG TXCTTCCATG GQGAAAACCT GGAAATTCTG TAAAGCTAAA ATATAGCAAT 240 

G*CAA1AATXA AAACAAAASir CTAAOATTTG AAGAGATAAT TTGCTTCAtSG ATTTTOATGG 300 

AAGGGAAIG CIAACTTXAA AAACCAGATT TOGGAGAAGT ACAAAAGAAA TAGAAATGCT 360 

CAAGAACTGC GAATGGAGAA AGTACAGITA GAGTTTGAGA ACCAA6AGAT GGAGAAOAAA 420 

CTGCAAOAAT TCOGBXCCAC AAOAAACAAA GAAAAGGAA6 ATAflBQAGTC AAGCGAGZ2VT 480 

T&CTGGAAAT CTGGAAAAGT GGGGRAATTG C3TCAATCAAT CATATATGAT GTCAOUUUVT 540 

AAAGGAAATG TTGTrAAfiTT TTCTQCTGGA AAAGrGAAAT TAAAATTGCT GAAGGAACftG 600 

AXTCAAGAGC CAOTGAAAOC AAlCAGTTAAT TATAAAATGO CAAATTCTTC A6AATGSXSAA 660 

AAACCCAAGA TAAATGGGAA AQTTTGTGGA CfiGTGTGAGA ACAAAGClGC TCTACTCGTA 720 

TQCCTTGAAT GIGQAGAAGA TTATTGTTCA GGATGCTTTB CTAATGTTCA CCAGAAAGGG 780 

GCACTAAAGC TCX»CAGAAC AACTCITTTG CAGQCAAGAT CTCAAATATT ATTCAATGTA 840 

TTGGATQTT6 CCCATCAGTT TATAAAGGAO^ GTTAATCCAG ATGAACCOUk AGAGGAGAAT 900 

AMCTcrawatt, aggaaacxsus kaaaattcaa cataaaccca AATcroTAcr tctccagagg 960 

AGOWGICTG AGGT2W3AAAT TACAACXSATG AAAAGAGCAC AACGTACAAA AOCAAQAAAG 1020 

AGTCTGTTGOr QTGAAGGGTC ATTCGATGAA GAAGCTTCTO CACAOTCCTT TCAGGAAGIG 1080 

TTAAGTCAAT GGAGAACOSG AAATC31TGAT GACAACAAGA AACAGAKTTT ACAMCBGCA 1140 

GXAAAAGACT GATTOQAAGA ATGCGAAGXA CASACTAATC TGAAAATTTG GAGAGAACCA 1200 

CTTAAXATTG AACXTAAAGA AGACATTCTA TCCTATATGG AAAAATTATG GCTTAAAAAA 1260 

Ca\CAGGAQAA CTOCACAAGA GCAACITTTT AAATGCTACC AGATAOGTTC CCACATCCAC 1320 

ATGAAACCaC TGGT6ATGCA CSlGTOT TCTC AAAATGAAAA CGAtGAAGAT AGTGAOXSGTG 1380 

AOGAGAOCAA AOTACaACAC ACfiGCTCTTT TATTGCCAQT AQAAACATTA AACATAGAGA 1440 

GACCTGAAOC ATCTCTGAAG ATAQTCGAAC TGGATGATAC TIATGAAGAQ GAATTTGAAG 1500 

AAG CTGAAAA TATTSIGCCT TACAAAGTTA AATTAGCTGA TGCAQACAGT CAAOGAAGTX 1560 

G1GC1TTTCA TGATTGTCAG AAGAATAGCT TTCCATATGA AAArGGCATC CATCAACATC 1620 

AarOTTTTCGA TAAGGGAAAG AGAGACTTCT TAAATCTTTG TCroaOAAAC AGCTCTACTT 1680 

ATTATAAAGA TAATTCAAAA GGAGAAACTT CAAACACAGA TTTTGACRRC ATDGTGGATC 1740 

CTGATGTGTA TTCTTCTOAC ATTGAAAAAA TTGftGGAAAG CAOCXCCTTT GAAAGAAATT 1800 

TAAAOGAGAA AAA3ATAGGT TTAGAAAGTA ATCAAA/iOTC TGATGATTCC TGIGTAICAC 1860 

TTGAAftGCAPL GGACRCTTTG CTAGGTAGAG ATTTAGAAAA AGCTCCCATT GAXSGAGAAAT 1920 

TATCTCMiGA CATCAAAGJyi, TCCTTGC3AAT TGAsSCAATCT GTATAAGAQG CCAAGCTTTG 1980 

AftGAATCAAA AACTACAARG TCA.TCACTGT TGTTACftASA AATAGCCTGC AGAAGTAAGC 2040 

CTATAACAAA ACAArATCAA GGACTTGAGA GATTCTTTAT TTTTSATACA AATQAAROAC 2100 

TCAACTTACT TCCTTCTCAT CGTTTAGAAT GCAAC&ATTC CAGTACEAGG ATTACACTTG 2160 

CAGGTCaGAA ATCACAGAGA CCTTCAACAG CftAATTTTCC ACTTTTCCAAC TCTOTTAAAG 2220 

MGCTCCAG TTOCCTTTCA TCCTCTCATC CTCGATCAAG AAGTOCAGCT QCTCAATCAT 2280 

TGCTTCTGAA ATTTCAGAAA TTGAATATAT TGATATTACT GACXaGAATS 2340 

^^^^^^ AGATGACACT ACTGATCAAC ATACTTTiW3A CRATTTQGAA AAftGftATTAC 2400 

AAGTGCTOAG ATCTCTIGCA GATACTTCAQ AAAaGCTTTA CAGCTTAACC TCAGAAGAGT 2460 

TCCCRGATTT CAGCAGCCAA TCACTQAATA TAAGTCAGAT TTCGACAOAT TTCCTTAAGA 2520 

CCTCACATGT GAGGGGTCCC TGTGGAGTTG AGGAATTQAG CTGTTCTGGA ABAGATACCA 2580 

AAATTCAGTC TTTGCrQTCA CTTTCTGAGA QCAGTACftfiA TOAQGAGCSAG GfiAGATTTTC 2640 

TCAACAASCA ACATGTCATC ACAC1MC3GT GGTCAAACTG rACTTAftaGA TTATTTGTrC 2700 

OCATXTTGTA CCCAGRQTAA ADCAAACAAC TOAGAAAAGT AAOCaAGTGA 2760 

AAGTGCTGGA GATTTTGATT ACTAATGTCT TIQATGTTTC AAG6CTACAA 2820 

ACTAATAAAA GTAAAATTAT AftOTTCAAAA AAATTTTTAA AAAAAAAAAT AAAAAA 2876 



Seq ID NO: C57 DNA Sequence

Nucleic Acid Accession #: NM_024687.1

Coding sequence: 138..1706

1 11 21 31 41 51

AAAAACATGA TGACAACAAQ AAACAGAATT TRCAXGCSGC AOTAAAAGAC TCATTGGAAG 60 
AATGCSSAAGT ACAGACXAAT CTGAAAATTT GORGAGAACXr ACTTAATATT GAACTTAAAG 120 
AAGAMTCT ATCCTATATG OAAAAATTAT GGCXIAAAAA ACAGAGGAGA ACTCCACRA6 180 
AGCAACTTTO TAAAAMCTA TCASATRCGT TCCXaCATCC ACATGAAACC ACTQGXGATG 240 
CAmGTGTTC TCAAAATGRA AAOGATCAAQ ATAGTGATGG TGAGOftGACC AAflSTACAAC 300 
ACRCAGCTCT TTTATTGCCA GTAGAAACAT TAAACATAGA QAGACCTGAA C3CATCTCTUA 360 
AOATAGTiCnA ACTGGATGAT ACTTATGAAG AGGAAITTOA AGAAGCAGAA AATATTCTGC 420 

CTTACAAA6T TAAATTAGCT GATGCAGACA GTCAAOSAAG TTGTGCETTT CATGATTQTC 480 

^AAGAATAG OTTCCATAT BAAAATGGCa TOCATCRACft TCATSTTTTC GATAAGGGAA 540 
AGAOAGACTT CTTAAATCTT TGTCTGAGAA AGAGCTCiaiC TTATTATAAA GATAATTCAA 600 

AAGGAGAAAC TTCAAACACA GATTTTGACA ACATOaTGGA TCCTOATGTC TATTCTTClXS 660 

ACATT GAAAA AATTGAQQAA AGCACCTCCT TTGAAAGAAA TTTAAAGQAQ AAAAATATAG 720 

^t^iJS T'^^^^'^AA^e TCTGAMATT OCT6TGTATC ACTTGAAftGC AAGQACACTT 780 

TGCTAGGTAQ AGATTTAGAA AAflfiCTCCCA TTOAGGWS&A ATTATCTCAA GACKCCAAAG 840 

^^^J^ ATTQAGCRAT CTOXAXAAGA GGCXMGCTT TCAAGAATCA AAAACTACAA 900 

f^^TTACAA GAAATAGCCr QCAGAAGTAA GCCTATAACA AAACAATATC 960 

AAGGAOTTGA GAGATTCETT ATTTTrTQATA CAAATOAAAG ACTCAACTTA CTTCCTTCXC 1020 

ATDGTTTAGA ATG CAAC AAT TCCAGTACTA GGATTACftCT TCCAGAAOAC AG&C3AATGGA 1080 

TTCCAGAOCA TAOCTTAAGT GAMaaOCTG ATAATOCAAT TQTCEIGGGT GTTCTGC&Ga 1140 

OTGCTCAQAG TCCATCATCA AGTAGAAAAC AfiCAAAAGAT GGGTCAGAAA TC3W»GAGAC 12 00 

CTTCAACAGC AAATTTTCCA CTTTCCAACT CTQTTAAAGA AAGCTCCRGT TGCCTTTCAT 1260 

CCTCTCATCC TOSATCAAGA AGTGCAGCTQ CTCAATCATC ATCCTAfiAGCT GCTTCTGAAA 1320 

TOAATATATT GATATTACTG ACCaeftATGA QCTTTCCTTA GATGACACTA 1380 

CTGMCtoCT. TACTTTAG&C AAITTGGAAA AAGAftTTACA ACTGCTGAGa TCTCTTCCAG 1440 

ATACTTCSMSA AAAGCTrCAC AGCTTAACCT CWSAACAGTT CCCAGAlTTC AGCAGCCAAT 1500 

CACTGA^T AA6TCAGATT TGCACfiGATT TCCTTAAQAC CTCACATGTG AGGQQTCOCT 1560 

GTGGAGTTGA GQAATTGAGC T6TTCTGGAA GftfiATACCAA AATTCAGTCT TTGCTGTCAC 1620 

^^^STSSSiS Sff^^^^^AT GftOGAGGAGQ AAGATTTTCT CAACAAiJCAA CATGTCATCA 1680 

CACTACCGTG GTCAAAGRGT ACTTAAAGAT TAffTTOTTCA TTftCTGTTTC CATTTTeCAC 1740 

SS^JiiSiiE ««»««GWV ACCAaSTGAT TAOCIATCCA AGlGClGaAG 1800 

CTAATGTCTT TWaTGTTTCR AQSCTACARA CTAATAAAAG TAAAATPAIA 1860 

ASTTCAAAAA AAAAAAAAAA AAAA J jjjj 

Seq ID NO: C58 DNA Sequence

Nucleic Acid Accession #: NM_005408.1

Coding sequence: 76..372

1 11 21 31 41 51

AAAAGGCC3QG OGQAACAGCC AGftGSliGCaVG AGRGGC3U»G AAACATTGlXS AAATCTCCAA 60 

arCT^UJCCr TCAACATGAA AGTCTCTGCA QTGCTTCTGT GCClGCTGCT CATOACSVGCA 120 

GCTTTmATC CCCAEGGACT TGCTCAGCCA GATCCACTCA AOGTCCCATC TACTTGCrQC 180 

JTCA^TTA QCfiGTAAGAA GMCTCCTTG CAGRGGClGA AGAQCTATGT GATCACCACC 240 

S^SSSfT^ CCCAOAAGOC TOTCftTCTTC AGAACCAAAC TGGGCAflGOA SATCTGTOCT 300 

^^CCWUuao AGAAOTGGGT CCaGRATTAT ATGRAACAOG TGGGCOGGAA JW5CTCRCACC 360 

CT^AOACrr GAACTCTGCT ACCC5CTACTG AAATCaAQCT GGAGTACGTG AAATGACTTT 420 

^^S^ TCTTCTATCC TTTGGAATAC TTCTAOCATA ATTTTCAAAT 480 

AQSATSCftTT CGGTTTTGTG ATTCAAAATG TACTATGTGT TAAGTAATAT TOGCTATTAT 540 

^CIGGTTTGG AGTTTATTTG A6MTGCTG ATCTTTTCIA AAGCaAGGGC 600 

TTCAGCAAGT AGOTTGCTGT CTCTARGCCC CCTTCOCTTC GRCTATORGC TCCTGGCAGr 660 

GGGTTTGTAl TCGGTTCCCA GGGGTTC3AGA QCATOCCTGr GOGAGTCATG OACATGARGQ 720 

SS^SS^ SltSSJ?^ ABAGCTCTTT GTGAATGTGA GGIGTTSCTA AATATGTTAT 780 

TCT^J^ TGAAW5CAAT AGTAGOACTC CIQACATTTT GCSW3AAAA3a. CATTTOATTT 840 



Seq ID NO: C59 DNA Sequence

Nucleic Acid Accession #: AK097746.1

Coding sequence: 185..2224






1 
I 

CTTTCATGAC 
CTTCAC3i.TAC 

attttocttt 

ACTGflJTGTTT 
AACTTOGCTA 
CAfiflOOACTC 
TJVAAGAGCCA 
GAAAATAATA 
TGTAACatSAC 
TTACTTDGAT 
TATGGCCAGC 
TATTTCACT<3 
AGAAGGAACI 
G6AAAAAATA 
GACAAGCTAT 
TAATQAACCT 
TTCTGATCCT 
TGATAGAATT 
AeTGACAGAC 
GTACATAGTT 
TAAAGGCACr 
TGAGATATTT 
CCtCXVXGMS 
TACTGATCSQ 
OGACATTGAA 
GTTAQTACAA 
OGAOCTTGAA 
CAfiTAQCTTA 
TAAGCCCCTS 
GTATAATTCA 
TTTAACTGGA 
ATATQAAXTT 
TA.TOCAGGGA 
AOUVTATCXC 
ATCTGGGATT 
AKSOAACTCTT 
AGAOCSACCX 
CTAAATTG<SA 



11 

\ 

AGTAACAAAT 
AACTTATATT 
TTATTATGTG 
CTTTTAACTG 
CAGGAC7VAAA 
AGGCAACATT 
CATAATGCTA 
ATTCTTCG6T 
AAACTAGGGA 
TCAAATTGCA 
CTGCTGAAAT 
GGACAOGGAC 
TCQGTGTGCC 
TOTGAAGATT 
CCftTCTTCAA 
CCCAOSGGTC 
QA6TTTTTCA 
CCATTTGAAG 
GATTOOGAlCA 
GAAAACCCTC 
TAT6Afi(3ACT 
GGATTACAT6 
TCCTTGCTCC 
ATT C TOT l 'Aa 
ATGGCACTAC 
GAAATOOAAA 
AAAGCTATTA 



21 

I 

CCAAGATTTT 
GTAATATATG 
CCAATCTTCT 
GAlGGAaTAAG 
GCTGGOAQGA 
TTTGTGAACA 
AATTTOCASC 
GTTTAAGACC 
AAAAGTTTGT 
CCATTCCCTT 
XTGCAAATGA 
AAICSGACCGAT 
TAC RgAMT TG 
TTACCTCTGA 
AATTCCCAGT 
TTOiSGCTQAA 
AGQGATGCXS 
CTATATCTTA 
QACGTCTTCT 
ATTATAAGTT 
ACATTGAATT 
AAAACG^GA 
TChiCCCaSGG 
AAATTACCAA 
GGftAGTATCC 
QATTTAACAA 
AGGGTGTGGT 



GGAAGTIACA 
GGAAAACCTT 
GCTAIGCAGA 
GAGQTTATCX; 
CTGTATCrCG 
AAACITCXQT 
ATAAAiSrOGG 
TCCACTAOGG 
ACTCGGCACI 
CMATTTATA 



TCACAGATTT 

aroTaTTTTQ 

ATTATGCCAG 
CATCTQATAC 
ATGGOGCAOG 
TTGACZCTSAT 
ATGCCTATGT 
GACATTCTAC 
GGATCAAS06 
AAACATGC3U^ 



31 
I 

GGAAAAGCGC 
CX3SATCACTA 
TCTGGCAAGG 
TCTTAAAAGT 
AATCTGTCQG 
TATATATGAA 
ACCAATGGAT 
TGATAAGATA 
AGAGCCrCCA 
AATTTTTQTT 
TAAATCTATG 
TGCftiGCAAAA 
OCATCTTGCA 
AACCIGTAAC 
AACAATTCTA 
TCTCCTTCAA 
TGGAAAGGAA 
CCTGACTGGO 
ATTAACCATG 
TTCTCCCAGT 
CflTTAASAAA 
CATCTCCAA0 
AQGCTOCAAA 
ASATATCCTC 
TGTGAGATAT 
TTTAATTATA 
TGTGATGGAT 
AATATGGGOC 
CCTAGCCX3GO 
aCTOTCAGST 
AAAATATACC 
ATCTGACACA 
CTGGGACOGA 
GCCCATCATA 




41 
1 

CTAOSATATT 
TTTGAGAAG6 
AAAGAGATTG 
□CTGABAAAA 
6CAAGTGAAT 
TGGCGAGAAA 
AAOAACCTAA 
ACCGCAGCTA 
OCA'FfTQATT 
CTATCIXXaG 
TCTGGAAATA 
ATGATTAAAG 

TCATCCTTTR 
CAGAATGGAG 
TCATATCTCA 
CTGTTATTTA 
GAaTGTAATr 
CTG6CTGACT 
QGAAACTATT 
CTTOCATTTA 
GATCTTCAAC 
CAGACftGSAQ 
AACZUU3CTCC 
GAAGAAAQCA 
ACTATACGTA 
TCTQCATTGG 
AAACGTTCAT 
TTGAACTTTT 
TTCTTTTTCA 
ACCCCTATTG 
TCACCAGAAG 
GAAAGTOOAT 
TGGATAAAAC 
TACAAOACAA 
ATTGCAATGr 
TTGCmGTC 



Seq ID NO: C60 DNA Sequence
Nucleic Acid Accession #: J02761.1
Coding sequence: 14..1159



1 
1 

GAWPTCCGGT 
GCTCI6TGGC 
TGAGTTCTGG 
ACRGGAAGTC 
CCACATCCTT 
GSAGCAGGAG 
TGACGACTAC 
CTGTAIGC&C 

cnrcAQAococ 

CSTOCTOOCT 
CTCCQAQCS^G 
GGOGATCCAA 

cxSTOffiACCrr 

OCTGCTOGAC 
GTGCaCCCATG 
CrCTQAGTCC 

ca:£accagag 

GCAATTTQTG 

ccaiGhcclux; 

GACGGGCrcr 
GOVGGCTGAA 
CAUl'l'dCACT 
ACCACAGTCT 
GCATCTGGAG 
AAAAACCAGC 
CCC3CRI6CAC 
AAAATGGGAA 
TCAAAAAATC 
GQGATGQCGA 
TTBTTAAGAT 
QATTGAAAQT 
GGGXTGTGCA 
AQICCtAErTC 



11 
1 

6CCATGGCTG 
CXJU3GCACTG 
TGOCAAAiGGC 
rrGQQGRCATG 
AACAAGATGG 
TGCAACGTCC 
TTCCCCCTGa 
CTGGGCCXGX 
CIGCCQ^AAC 
GTGCTGCCGG 
CAATTC3CCCA 
GOCATGATTG 
CTGQITGGCGa 
ACGCTGCTGG 
QATGACAQCQ 
CACCTCTGCA 
GCAATQCTCC 
GAGGAGCACA 
TQCCAGGOCG 
GAOCTTTGAT 
GOQACCATGG 
AAGAAGCXrrC 

ggt8gaccac 
gacajccaoag 
gagctctgca 

CACTGCTTTA 
TCftAAGATTG 
GATACA6AAC 
AGCAATTCCC 
CCGGGCAAGC 
ATAAAAAGGG 
CCTAGGGTGA 



TCITTTAAAA 



21 
1 

AGTCACAOCT 
CTaCCTOGAC 
TGGAGCAAGC 
TGGGAGOCGA 
CCAAGGAGGC 
TCCCCTTGAA 
TCATCOACTA 
GCAAATGCCG 
CCCXGCGOGIA 
GGGCCCrCCA 
ITCCTCTCCC 
CCAAGOGTGC 

GCCGCATGCr 
CTGGOCCAAG 
TGTCOGTGAC 
AGGCCTGTGT 
OGCCXJCAGCT 

GAGAACTCAG 
TGACCAOQCP 
AGCrCCCACA 
OOGCCCCCAG 
CCCACITCXIA 
GOCTCCACAC 
CAGGACA006 
GATTTTACAG 
TCTAAAAGKT 
CAGCX3TAGTC 
TTTCXIOCrC 

TTCTATTTCr 
CSCATrOCTGT 
AAAAAAAAAA 



31 

! 

GCTGCAGTGG 
CSiCCTCATOC 
ATT6CAGTGC 
TGACCTAIGC 
CATTTTCCAQ 
GCTGCTCAT6 
CTTCCAOAAC 
GG»8CC»GAj6 
OCXTCTGCCA 
GGOGAGGCCT 
CTATTGCTOa 
GCTAGCTGTG 



CTQCTOCTQC 



GGGC3CAGCTG 
GrCGCC3GACA 
CAOCCAGGCrC 
TGGCTCXITGG 
GCTGAOOCTQ 
TGGGAGCKTG 
CTGTCCAGCT 
CTTCCCCTSC 
CCGCOCTCCT 
OCCTOTGTCB 
TCCTCTCTGG 
CTACCACOAC 
G6TTGAAGCT 
CTACTTQCAA 
AGACATCS^GA 
AAOGGTGOAC 
GAGATGCTCT 
GTCTTTCTGT 
GCTGTGATTT 
QTAATACAAT 
AAAAAAAAG5 



AGAOCOCXAG 
CAAGAGTGTQ 
GAGACGATGA 
CCCCAGTOCA 
G^GACCGACX 
OCAGAGCAOa 
GACCCTCXGC 
OGGOC7GACA 
CTCTGCAGGG 
GCAGTGOOCC 
GCCGAGCXSCT 
GTCIGaCGCt! 
06AGAATGGC 
GGQAACAGCA 
CTGGACAGOG 
oraOOCAGGG 
TOCAGOCCrC 
GCAAAGGAAA 



CAGCICCCTT 
GCCreGTCTG 
TGT6AGGCAC 
CTCCCAGGGC 
GAGCCOCGCX: 
TTCAAAATTC 
AKTTGTTAAO 
ACTGCSACGCP 
GCTGCTTQAG 
AAGGTGGACT 
ATCTGCIGAA 
GTCTGCACCA 
GAATTC 



51 

1 

TAAATGACCa 
ACAAGCTGTT 
AATACCAGQA 
ATCCTGAXCC 
TTCCTGCCTT 
TCTATGACAS 
ATGAACTACA 
TAACAAACTA 
TGACAAAGA6 
GAGCAGATCC 
ASTT1CAAQC 
CAGCAATIGA 
TOCCCATSTT 
GGCTTTGGCT 
TAAAAATGAC 
CTGATCCAGT 
TCftATCAATA 
A<a:GGiU3GAAG 
TTTATAATCT 
TTGCACCTCC 
CTCAACACCC 
AAACAAAAAC 
CCICaOOAAG 
CTAGTGATTT 
TGAATACXGT 
ACACTCTAC5G 
AGGCACTCTC 
AOCCAAOOCT 
TACAGGACra 
CTCAGGCCTT 
ATTTGCTAGG 
ATG6TGTTTA 
TGCTTGCTGA 
CAACTCAAAA 
GTQAAC3GTAA 
TGTTAAAAAC 
AGTTGQATQA 



51 
I 

TGCTTGOCCAC 
Oa»GGGCCC 
GGCATTGCCT 
AGOACaLTOGT 
GGAAGTTCCT 
ACCAAGTGCT 
CAAAGGGCAT 



IGGACAAGCT 
CACAGGATCT 
CTCTGATCAA 
AOQTGTGdOG 
ACTOOGTCKT 
TOGTCCTOOS 
TGCOGCGAGA 
GCGAGCAGGC 
AAAAGTGCAA 
GCXGGGftXGC 
TCCAGTGTAT 
AGCCAAGTGA 
TOSCCAGCrG 
CCICQGCnOT 
TCTCAGCTCA 



TGGGCrCAOG 
TCSU!AGCX3iC 
AGZ^AGAATAA 
TTAACHJl'lTT 
CTGGC3LTSAT 
AGCTATTGCT 
TCCAGATTTT 
AQCTCAQCTQ 
ATGCTAATAA 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2256 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2026 



Seq ID NO: C61 DNA Sequence






PCT/US02/36810 



Nucleic Acid Accession #: DM 139172.1
Coding sequence: 19..553






1 
I 

GGGGTtTTGGG 
CTGCftGGGCT 

ggtgrcatat: 
tgcctbcgtc 

AACSTGOGGA 
AGCTGCAGCA 
TTOCTGGCGG 
ACCAACSAAGA. 
GIGGAQGGA& 
QAaOAaSATT 



11 
I 

GAQGTGACAT 
OGGCAGACG6 
OGGACCGGGA 
TCCnC3TGCTG 
GGAAGCACAT 
TCTGCTTQTT 
6TCC6TGTGA 
OGCCQTCCAC 
GCAGGGAGGG 
AC3C3GQAGTCC 



21 
I 

GTTGGGCrrGT 
AAATGGAATC 
GAGCTGTGGG 
CTACCGCAAT 
GTGGGCGCTG 

CATGTCCAAG 
GGGCAGCGTG 
GGAAGGGAGG 
COGGGCiGACT 



31 
I 

GGGATCCCAG 
CAGGGATTCT 
GGCCAGGCOG 
GOGGTCTGCT 
GTCTGQACGT 
AAGCGCCGGO 
TOOOTCTOSC 
CCAGTCGCCC 
GAGGAGGGTG 
GCTCAATACA 



Seq ID NO: C62 DNA Sequence

Nucleic Acid Accession #: NM_054023.

Coding sequence:



1 

I 

OGGOACACTT 
ATTTTTKAAC 
GCIGGTGAGC 
OCTTCCTGTT 
ATTAAAGCTT 
GAAQTGTQTA 
6CTATCACAC 
TQCTOCZATC 
GTAQTGAGCX: 
AAAAAAAAAA 



11 
I 

TOTATGOCAA 
TOCTGAAAAA 

(3ACAA(3TTGG 
CTTCTGAAAA 
AATQACICTGO 
TXGGT6TGAC 
CTCCCTGCCT 
<3TGAAAAOGA 



21 
1 

GTGGAACCAC 
TATCCCAQAT 
GTASTTACTC 
CACC^TTACC 
CTCTGGGCAT 
GACCAGAQGC 
ATCAAGATAA 
GAA2\CCTGTT 
CSUU^TAAAGC 



31 
I 

T6GCTTGGTQ 
AACTGICATG 
TGCTACTGCC 
TCTG6ACAAC 
TTCTGTTCaAG 
TTCTGAA6CT 
AGAOCOGAGG 
CTAOCRATTA 
AATGAATACr 



41 
1 

OGCTGGGCCT 
TCTACCCATG 
CCATCGATAQ 
ACCAOCAlSOG 
GCA6CGGCCT 
ACGTOCTGCA 
IXJCTCTOCAA 
TCjrCCAAAfiA 
AOGAGACAGA 
GATACGGTG6 



41 

1 

GATTTTGCTA 
AAGCTGGTAA 
TTCCTCATCA 
ATTCTTCCCT 
CACCTTGTGG 
OTOAJUaAAAC 
TGGATGGGGA 
TAGATCAAAT 
AAAAAAAAAA 



51 
I 

GCTCCTGCTG 
GAGCTSTGAQ 
CCCCAACCrC 
TCCAGACJGAA 
CCTCCICCTG 
TATGCCCGGT 
GCACCQAGGG 
GTCCAGGGAT 
GGOCBAGGAA 
ACS 



51 
I 

GATTTTTCTG 
CTATCTTCCT 
ACAAAGTtSOC 
TTATGGATCC 
AGGGGCTAAG 
T6CTGGAGGC 
TGGAAGATGA 
GCOCTAAAAT 
AAAAAAAAAA 



Seq ID NO: C63 DNA Sequence

Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..2874

1 11 21 31 41 51

| | | | | |

ATaCGCCTGX CCTAXGCCTA TAAAAAlOGCT GAGACCCTAG CAGGCAGACA CACAAGCAGC 
TGGATGTOGA 6AGGAGCATA TCAGCBCaaGO AACACACGGG CRGCTGOACQ TCCAGAGGAA 
TGCACTGACA GAAACTGGCA TGCTGGCAGA ACAOGTGQAA TTTGGCXGGG GCAGTTGGAG 
GAGAGATGTT CAGATGTGTT CGGAGTTTCT TTCTTCIGGT GGGTTOGTGG TCTCGCTGGC 
TCAGGAQOGA AGClGCAlGA£! CITQ^CX3CCA GCCCAOGAAG GQGCTCCCAC AGTGCAGC3GG 
CAGGCTGAAG GGCTCCTCAA GTGOCGCCAG AGTGGOCSTC CAGGGAGAGG AGGOGCGOAO 
AGCGAOCGAG CGAOaOATGC CAGCATGCTG TCACCTCICA GTGCTGOCAX G06AAACTAC 
CCAACGTCCT CTAOCATCX^C TOCAAGAAGA TCCTACTCXC CAAOCGAAAT TGCTCACAAQ 
AiSTTACrCCT GCAGCCITCC AI3ACATGAAA ATCTCCATGG CAQAATCTGG CCCCTCCTTG 
GATAGCCTTG ACATTCTTGGA GGATOGGQAQ TCTQGGTCAC CATTTCTTGT GACTCATTTG 
TACTTTCIGG GSGXTGTCAC CACTGGGATG GAACAACXAG ATTTTGAAAC AGGACCAAAC 
ATATTTGATT TGCAGATTTA TGTGAAOQAT OACSGTTGGTG TCACAGACJCT GCAflGTCCXG 
ACTGTOCAGG TAACAGATGT GAAOGAGCCA OCTCAGTTTC AAGGCAACTT GGCASAAGAT 
CATCTCGBTO CAGAOCAQCC ACAXTTCAAT GCTCATAGTC ACAGGTAGST GAGGGTAGTG 
GGXACIGCKT TGGCCAGGCA CM3GCTTAGA TCX»GCATTG OTXCCGCCTT OCTGOGCACC 
TTCTQTOTTa TGGTGGGCAT GCAGTATTTC ClIGATTTCrC CCOCAAAQAS CITCAGAATG 
TCTGCTAATG GCACCCTCTT CTGCACAACA GAATTQOACT TTGAAGCAGG ACACAGAAQT 
TTCCS^TCTCA TC0TGGA£3GT GAGQGAClUXr GOAGGCdCA MGCCTCCRlC AGAGCTCCAG 
GTGAAGATOS TGAAGCICAA OGACGAAGTC OCTGGC7TXA CX3UQ0C0GAC AOGAOTGTAC 
ACBGTCCTGS AGGAACTCSnS TCCAGGAAtX! AXG6TGGCCA ATATCflCnSC GGRGGATCCT 
GAIIGATGAAG GTTTTOCC34G OCAGCTCCTC TACAGCATTA CGAdOTTAG CAAATAXTIC 
ATGATAAATC AGTTGACTGG TACAATCGAA GTOGCXXZAAA GGATAGAOOG AGATGCAGGT 
GAATTOAGAC AAAATCCCAC CATTTCCClG GAAGITCTAG TGUU3GACAG ACCATAIGGG 
GGTCAGGAGA ATGGCATCCA 6ATAACCXTC ATTGTOBAAiG AOSK»A06A CAATCCTOGC 
ACATOOCAAA A6TTCACCTT C3U3ATCCIU3T CTCCACCCTG CTCTSIQCXC CAAlSAGGCTG 
ACCTGGATG6 ATACOGTATT AGACrOTTTT CATGCTGCIG ATAAAGKCAT AGCTGTGACT 
GGGCDQATTTA CAAAAGAAAG AGGTTTAArr GGACTTACAa TTC!CACRTGG CTGGGGAAGC 
CTCACAATCA TGGCAGAAG6 CAAOGAGGAG CAAGTCACAT CTIACATGGA TGOCAGCAGG 
CAAAOAOKTA GAGCTTGTGT AGGGAAACTC CTCCTTATAA AGCCATCfti^ TCTCATGAGA 
CTTAGTCACT ATCAOGAW^ CAACTCAGQA AAGACXlXSCC GGCATGATTC CATTTCCICC 
TACCAGC7ICC CTCCCACAAC ATGTAiSQAAT TCAAGAATCC A0GCCACCZAA CAAGGAAGAC 
ACAAGCTCTG SrCACTGTTAC TGTGAACATC CTTGAAGAAA AlGAl^GAAAA GOC&ATTTGT 
ACTCCAAACrr CTXATTTOCT GGCCCTCCCA GTGGATCTGA AftGTTGGCAC AAAtATTCAG 
AATXXCAAGC TGACATGTAC GGAOCTTGAT TCCAGCCCXS^ GATCTTTCOG TTATTCCATT 
GGCOCAGQTA AOGTCAACAA TCATTTCftOC l^TCrCTCCCA ATGCTGGTTC CAA16XCACA 
GGCCIGCT6C TTACA-rClCS CTTT6ACIAT GCTGBTQaOT TTGAXAAGAT CTOGGACTAC 
AAGCTACTTG TCTAGGTAAC TGATGACAAC TTGATGTCrTG ACAGOAAGAA AGOOGAGGCT 
CTTGTTGAGA CJVSGAACAGT GACACTGAGT ATTAAAGTCA TTCGOCACOC AACCACEATC 
ATCACCACGA CCCOOMSQCC CAGGGTCACC TATCAGGTCC TGAGSAAAAA CGTXTACTCT 
CCATCIGCAT G6TA0GTGCC QTTTGTCATC ACZTTGGGCr CGATATTOCT TCTOaSTCTC 
CTOGTGTACXr TGGTOGXCCT ATTGGCCAAA GCXATCXT^CA GACACTGGCC CTGCAAGACT 
GGGAAGAACA AGOAACCTCT GACAAAGAAA GQAGAAACGA AGACXOCAGA GAGAGACGTC 
GTGGTGGA7VA CTATCCAGAT GAACACTATC TTTGATGGAG AAGOCATAGA TCC3W3AGOCT 
GAGCAAGCTT CACTCBAGCT CTATGCCCTG CTBCCCAGCT GCX6CGAC0C XAGTCCA6TA 
ACCCZAAGAA AGGTCCAGGT QTaTGGGGaG AGTGAAGAGA C0GGTCAGT6 TTCCGQCCAC 



60 

120 

180 

240 

300 

360 

420 

480 

540 

593 



60 

120 

180 

240 

300 

360 

420 

480 

540 

550 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 






ATCACACTTC CCGGCAAGKT TCCAOTCGAT GACCCftAGGA AACiU3QA2UU: AGGCCTGCAG 2760 
G<3TGATTTC3a AGGTCTGGAi: TCTATGCCCC GCTOTGAAG6 TGGITGTAQQ CAS^DCCTCAA 2B20 
GCTGAACGGT GCATTCGATT GGCTCTCAGT CTGAAAAAGT ACAOTTCT6A TTAA 2874 

Seq ID NO: C64 DNA Sequence

Nucleic Acid Accession #: XM_168571.1

Coding sequence: 155..388






1 
I 

TACACRGTCC 
CCTGATGATTS 
TTCATGATAA 
6GTC3AATTGA 
GGGGGTCAG6 
GCXJVCATOCC 



TTC3tRCTTCA 
GGCTCTGGGA 
GCOIGCAATA 
AATAAOGTCT 
AGGCCATOCT 
OCTGAGGAAA 
CICCATATTG 
CAGACACTGC 
GAASACTGCA 
AGAAjGCXSMTA 
AAAGTG6AAA 

TQCcccTwrac 

AGAAGATGAS 

CAAACCACAC 
TGC!CTCA(3CC 
TC3RJSGGGATT 
GAAAGGG(3TT 
ATT TCAGARC 
TCKSTTTCTT 
TAGGAATTCC 
CTGCCCrrCGX 
AAGTCACCFC 
AAQASTTGGT 
GQTSTCGGGA 
TC^USTTCTCr 
CTACACTTGG 
ACTACAGQCA 
ATCACCTAAG 
TAAAAAATAC 
AGGCTGAGGT 
GTGCCACTGfC 
AAGTQAATTA 



11 
\ 

TGOAOGAACT 
AAGGTTTTCC 
ATCAOTTGAC 
GACAAAATCC 
AGAATCGCAT 
AAAAGTTCAC 
ACCXAAACAA 
CCATGCCRTC 
AGATTGTGCT 
AATATACGGT 
AC6TTTATAT 
ATGTATTT6A 
AAOGTTTACT 
CTTCTGCSGTC 
CCCTGCAAGA 
GASAOaSACG 
GATCCAGTGA 
GATCCAiCTAA 
AOAOTCACTG 
CTGACIGGCA 
CTGQOCAACC 
CCAGGAAAGT 
TAAATTCTAT 
CAGACATCCA 
TGATCACATA 
TACCrGTGCT 
CTTTGCATTG 
TGACATAAAT 
TTTAACATCA 
ATIATCTTCTT 
CATCrOCSAAA 
TCAGCATCCT 
TCATGGTTAT 
6TGQC&SCTQ 
GGGCATGAT6 



21 
I 

GAGTCCAQGA 
CRGCC2\CCrC 
TGGTACAATC 
CACCATTTCC 
GCAOATAACC 
CTTCAGCATT 
OTTCTGCTTT 
TGGAGTGGGG 
GATTGQTQAT 
GATAATOCAG 
CCTAACAA6C 
TGTGTCAGAA 
CTCC3VTCTGC 
TOCTCGTGTA 
CTGGGAAOAA 

COGGGGAAAC 
CCCZAAATGCC 
CTGGGGAAGG 
AAGOSTGGGC 

AAACQGGGTC 
GGGGA^TGCTG 
GGGTCAAACA 
GTT6C6TGIT 
TCTQATAAGC 
ACTGCTAGGA 
AOTGAAGACT 
CCGW3ATTTC 
CAAAAGAAQC 
GAAGAAAACT 
GCATOTOAGA 
GTCTGGTTTC 
C^CAQATTT 
GCXCA'rGCCi 
CAAGATCAGC 



3X 
I 

ACCATOGTGG 
CTCTACAGCA 
CAAGTGGCCC 
CTGGAAfiTTC 
TTCATTGTGG 
ATGGTGCCGG 
GATGATGACA 
AGCOGCAGCA 
CTAGACTAOG 
GKKAGGATQ 
CCAGftAAATG 
AGAAGGCCOG 
ATGGTACQTG 
CCTGGTOGTC 
CAASQAACCT 
AACTATCCAG 
ATATGRATTC 
AAAATGGAAA 
GATGGGGXCA 
TOAGGATGCT 
AAAICCAGCC 
TAAGGAGGQQ 
TGGGCATGGT 
TGCGATGTTT 
CTGAAArtGAT 
AAGALrrUXTA 
AGCTCTATTC 
ATC3CTTA^T 
TTGRGTTATA 
AOAACAGTGA 
CAGTAGGCAC 
TTCATCCACG 

ArrcTA-rGAT 

TTTACTTTTA 
GTAATGCCAG 



4X 
1 

CCAATATCAC 
TTACCACTGT 
AAAGGATAGA 
TAGTGA/USCsA 
AAGACGXCAA 
AAAGAACAGC 
GTGAGGCAOC 
GATTTTTACA 
AAAATCCAAG 

AGTTTOCTCT 

CCCAGGGTCA 
CSCGTTTGTCA 
CTATTGGCCft 
CTQACAAAGA 
ATGAACACTA 
AACTCAAAAA 
tSAGTCCAGCC 
CTGAGAAGTG 



TTCAIGAACA 
CCTGTCAATC 
GTAGGG06GA 
GACAAATTTT 
ACAGGAACAT 
ACTTTGOGGT 
TGrCC3U3CA!r 
CTGGTTTCCA 
AATATGOCAT 
AAAAAACAGA 



TATATCACAA 
AAAAATATAC 
CACXTTOCCA 
TGGCAAAACC 



51 
I 

AGOGGAGGAT 
TAGCftAATAT 
CCGAGATGCA 
CAGACCATAT 
CGACAATGCT 
CAAGGOGAOa 
AAACAAGAGA 
GGATCXZAGCT 
TAACXTTAGCA 
TTACTATAAA 
CATTTTTGAT 
GCTATCAOGT 
TCACTTTGGG 
AAGCCATCCA 
AAGGAGAAAC 
TCTtETGATGG 
CTGQAGCCAG 
ACCAGGGAGC 
CCAACTGGQA 
CCAGAAATGA 
GGGCETACCC 
ACTGAGATGC 
AAATGTGGQC 
TAAACAAATA 
TTTCTATCAO 
GTGGAATTGT 
AGAAAGTTTG 
CCTTATTTTC 
ACACCTTTGT 
XGAGTAA6TT 
TTTTTCITBT 
TA6CAGTAGT 
ITTATCTATT 
TTAAAAGTGA 
AGGT^GObG 
CIGICTCTAC 



AAAAATTM3C 
AGGGAlSAACr 
ACrCTAGOCT 
CAACACT 



GCITAAACCr GAGAGGTGGA GGTTACAG76 AGTTGAC^TT 
GGGVGACAAA GCUGACTCC ATCFCASAAA AAAAAAATAA 



Seq ID NO: C65 DNA Sequence

Nucleic Acid Accession #: NM_005266.3

Coding sequence: 122..1198



1 
I 

ggcaggaggc 
acacaasggca 
gatgggggat 
ggtaggcaag 

TGCTGAGTCr 
CTGCCAQAAT 
GCAGATCATC 
GOGCATGCAG 
TGOCTCTTAC 
TGGAAG6ATT 
CACCACCATG 
CACCCTGCAT 
GCCCACAGAG 
GCTXnaCCTQ 
AOCXSOGGCAG 
CTGCACACCA 
CAATGOCTTC 
AGTACQAGGT 
GCCIGAGGXG 
CAAIBCaACBT 
CTOCTTTATG 
G^COCXTCTG 
TCTCAA1GAC 
AGGTGACAAC 
TTCAGATTAC 
GGAAGGGAXA 
GOGTTCrCTA 



I 

CATTTXCAAA 
GCAAGCACTG 
TGSAGCTTCC 
GTCTGGCTCA 
TCCIGGGGGG 
GTCTGCTACSe 
TTCGTCTCXSA 
GAOAAGCSCA 
GA6XACOC66 
GCGCTCCAGG 
GAGGTGGGCr 
GTCTGCOGCA 
AAGAATGTCT 
GCTGAACXCr 
CACATGaCTA 
COCCOOGACX 
AGCftATAATA 
CAGGAGCAGA 
OOCAATOSAG 
CTTAGTAAGG 
GGAGGATCAG 
AACTGATGCT 
GTTGCXCATT 

ccACGCaoAc 

TCATGAAAGA 
GCCAGAGGGA 
AGTTCCEACC 



21 
I 

CflGTCCCTOC 
GGAOACGAAA 
TGGGAAATTT 

ATOAGCAGGC 
AOCAGGCITT 
CGCCCTCTCT 
AGCTACGGGA 
TGQCAGAGAA 
GCACTCTGCT 
TCATTQTGGG 
GGAGTCCCTG 
TCATTGTCTT 
ACChCCXQOS 
AGTGGCAGCT 
TTAATCAGTO 
TGGCCTCOCA 
CIOCIGOGGA 
TCTCACCAG6 
CCRC5C3VSCAA 
GACCAGGTQG 
TTCrO^ClGT 
AATTCTAGAA 
TGCAGTTCCC 



TAGAKTGACX 
TCClTGAGCr 



31 
I 

IGGGAGAACA 
GTTTTGGCAT 
OCTGGAOGAA 
CATATTCdST 
TGA7TTCGGG 
CCCCATCTCC 
GGTGTACATG 
GGGGGAGAGG 
GGCAOAACia 
CUUCACCIAT 
CCAOTAdTTC 
TCCCCACC5CG 
TATGCTOGCT 
CTGGAaOAAG 



CCTGQAGAAT 
ACAAAACACA 
AGGTTTCATC 
TCRCOGOCTT 
GGCAAGGtCA 
GAAGAAAGGA 
CATCACTGCr 
ACTATAAGCA 
TCXZCCACCCr 
AAABAAGGGA 
CTCrCTCTAC 
GKTCAOCCTC 



41 

I 

CA0ACAGGCA 
CTGTTCCCTG 
GTACAOU^ 
ATGCTOGTOC 
TGTGA13VOGA 
CACATTCGCT 
GQOCACGCX3V 
GGCAAAGAGG 
TCCTGCTGGG 
GIGTGCA6CA 
ATCTAOGGAA 
GTCAACPGTr 
GTGGCIGC2KC 
ATOtfSACAGC 
TCTGTOOGCA 
QSCCCTGGGG 
6ACAACCTGG 
C3U3GTTC8TT 
CXXX^ATGGCT 
GAIGACCIAT 
GGCTCAGAGA 
TGGCTCCTrr 
GGGCTCTGOG 
CXACCCAGTA 



ATAOCAQCnS 
CCrOCTCCAA 



51 

1 

GAGGATTACA 
GCTGTGCCAA 
ACIOGACCXST 
TQGGCAOtOC 
TTCAGOCIGG 
ACTGOBTGCT 
TGCACACTGT 
TCO^GGGCTC 
AGGRA GGGjAA 
VCCTGATCXSB 
TCTTCCTGAC 

aostatccx:g 

GATTTGTCAA 
TAGTCCAGAG 
GAAAATTC^T 
TCACOGAGCA 
ATGGCCABAA 
ATGAIAGIGA 
CAjSTQTGACC 
GGAAAGACGT 

gagcxxx:ggg 

ATAGTAAGAG 
TACGAAGCCT 
AAGCTGGCCT 
CATAOCAAAT 
OOAAGAGCTC 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600

660

720

780

840

900

960

1020

1080

1140

1200

1260

1320

1380

1440

1500

1560

1620

1680

1740

1800

1860

1920

1980

2040 

2100 

2160 

2220 

2280 

2340 

2357 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380

1440 

1500 

1560 

1620 









AftACTTGCKA GCCAATABAC AQCATCAATC AAGGAACTTG OVrTATATGT QCTCTTOAAT 1680 

CTQTTGrCrC CATGC3ACCAT ICXZTCGGAGT AGTGGTGAGA TOGCCTTGQ6 TTGCCCTTGG 1740 

CTTCTCCTCC CTCTACTCAG CCTTAAAAAO GGCTTCTTtSCS AACTTTACCA OCAGCCTCAG 1800

CTTTACAAAT QCCTTGGTAT GTACCTCTGG CAAATGCCCT G6TATGTACC TCTQGCAAAT 1860

GCCCCACCTT GGTGATGTTa CAACCTTTCC TTCTGCTAGG GTOTACACCT A(30CTGTGCA 1920

GGTGTCftGCC CTGCTAGGGA GTCACTGTAC ACACAAACTC TACTGGAAIT CCXQCSCAACA 1980

TCTGrCAOCC TGCAGCTCXTT TTACAGTTCA ATCCAATGAT AGAAACCJ^TC CCTTOCCTTT 2040 

CTCCCTTGGC TQTTCACCCA GCCATTCCCT GAAG(»CTTA CCAACAOGAA TATCCAAGAA 2100 

GCTQTTQTCC TCTCTCGAAC CXTGACCAGA TCATCAGCCS^ CIGAGGCCAG TOGau^TTTCX: 2160 

CCAGGCCXTG TTAAAAAAAA AAAAAAAAAA 2190






Seq ID NO: C66 DNA Sequence
Nucleic Acid Accession #: NM_014459.
Coding sequence: 738..3407



1 11 21 31 41 51 

I I I I I I 

GTAGRTGCAG TCOGCCGCC3G CKGCTGCXTTC AGCC3W3CAAT GCAAGATTAQ ATCTCTAAAT 60 

GCAQCAAAAC ACTGCCTGAA AACAGACGGG CCOOCQCAGC AAGCAGACAT TTCAOGGTGC 120

GCTGG6GAAG CTICAAAATA TATCTBTGAC TCTGTCrTCG tTTGCTCTTCA TCCCCATCAA 180

TTTCATCACX5 GGA6GOQAGC AGCAAGTAAO AATTTCACTT TC33aATClGC CIAOAGACAC 240 

ACCTOCCTGC TCCCrCCCCC ACTOGATGTG AAGAGTATTC CGGAGTCTCC GGGOGGGAGT 300 

AQATTTQCAQ CACCJZTAGCG GGAGCX^AOQA AAACCTACTG ATTCTITAGC TCATTAXCAT 360 

CTCTCCCAGA CGAGATTTCC TTCTTATOGC CTGCCTCATC GCPCAAOTTT QAGOCrCCCX3 420

AAGTCOOeSC GOQAOAGACa AAACCCCTGG CICACCCOCA GCCGCAGGAA GCCACOGCCT 480 

TGCTCCAAGC OCCI6CAGCT CTGCI6CACC GOUSCTTCTC ACCCAQTaGS GA^CTGTAO 540 

ATCAACAGGT TCAGGGAACT TGAGCACSAAT AAGGAGA6AC CACOSGGO^ CGCAGCTCG6 600 

QTGCAGAaaa AAAAAAGGAC CCATAGACTT OTGGCTCGCG TCQOQCXSCGC ACGCTGC3GCC 660 

AGGGCOCCAG GCTGGCGCGC ACTCCCTCTC TGGCTCCTCC AGTCCCaATTG CTOCTGCCCC 720 

CACCTIACAQ aTCTOGOATG TACCTTTCCA TCTGTTGCrG CTTTCXTCTA TGGGCCCCTG 780

CCCTCACTCr CAAGAAGCTC AACTACTCGG TGCCGGAGGA GCAAOQaacC GGCACGQTQA 840 

TCGGGSUUavr COGOUaOGAT GCrCGACTGC AGCCTGGGCT TCC3SCCTGCA GAGOGOGGOG 900 

GCX3GAGGGC3G CAGCAAGTCXS GGTAGCTACC GGGTGCTGGA GAACTCCGCA CGGCSUOTGC 960 

TGOAaaraoA caacAGACAoc GGacrrccrrcT acaczcaagca gcgcatogac ogcgagtocx: 1020 

TGTGCOGOCA CAATGCXyUlG TGCC3\GCTGT CCXTTCGAGGT GTTCGCCaAC QACAAGSAQA 1080

TCTGCATQAT CAAGOTAGAG ATCCAfiGACA TCAACBACAA CGCQCCCTCC TTCTOCTOSG 1140 

ACCACATGGA AATGSACATC TCGGASAAOG CTGCFGCGGG CACCX3GCTTC CXX3C!PCACCA 1200 

GCX3CACATGA CCtSCTACHCC GGGGAGAATG GGCTCGGCAC CTAOCTGCTC AOGOGOGAOG 1260 

ATCAOGGCCT CTTTGGACTG GAOGTTAAGT OCaSOGGOSA CGGCACX3UU3 TTCDCAGAAC 1320 

TGGTCATCCA QAAGOCTCTB GACCGCQAaC AACAGAATCA OCATAiOGCTC GTGCT6ACIG 1380

COCTOGAOGG TGGOGftGOCT CCAGGTTCOG CGAGCGTACA GATCSVAOGTO AASaTGATTG 1440 

ACrCCAACGA CAACAGCCCG QTCTTCBAGG OGOCATCCXA CTTGGTGGAA CTGCOCXSAGA 1500 

ACGCTCG(3Cr GGGTACfldSTG 6TCATCX3ATC TGAA06CCAC CGACX3CC3aAT QAAGGTCCCA 1560

ATGGTGAAGT GCTCTACTCT TTCAOCAaCT ACGIQCCTOA C3CG06TGOGG GftGCTCTTCT 1620

CCATOGACCC CAAGACCGGC CrAATCX3GTS TGAAGGGCAA TCTGGACTAT GAQGAAAACG 1680 

GGATGCIGGA GA7T6ACQTG CAGGOCOSAG AOCTGGGGOC TAACQCEAXC CCAGCCX3WCT 1740 

GCAAA6TCAC GGTCAAGCTC ATC3GAOCGC31 ACGACAATGC GOCGTOCTTC GGTTTOQTCT 1800 

GCGTGCGCX2A GGGOGCajCTG AGCGAGQCC3Q C0CCTCCCX30 CACCGTCA^C GCCCTGGTGC 1860 

GGGTCACTGA CCGGGACTCT GGCAAGAACG GACAGCTGCA GTGTCX36GTC CTAOGCXSaAG 1920 

GAGGGACMGG CGQCOOcaaa GQCSCTaQaca GGCXIoaGGGG TTCOSTOCCC TTCAAGCTTC 1980 

AQGAOAACTA OGACAACTTC TACAOSGTG6 TGACTGAGCG GCOGCTSOAC CGCSOAGACnC 2040

AAfiACGACTA CAACGTSACC ATGGTGGOGC GGGA0SGGG6 CTCTOCTOCC CTCAACTCCA 2100 

QCAAQTOGTT CMOGATCAAG ATTCTRGAOG AGAAOGACAA CCX3aC3CTCa3G TTCACCA2UVG 2160 

GGCTCTAOGT GCTTCAQGT6 CAGGAGAACA ACATOCOGGG AGRGTACCTG GGCTCTCTCC 2220

TOGCCCAGGA TCCOORGCTa GGCCAC^AACSG GCAGOGTATC CTACTCTATC CTGCCCTCGC 2280 

ACATGGGGGA GGTGTCIATC TACAOCtATG TGTCT6TGAA TOGCAOGSUkC GOGGOCATCr 2340 

A£3QCCCTGCS CTOCTCTTAAC TTOGAGCftGA OCAAGGCTTT TGAGTTCAAfi 6TGCITGCTA 2400 

AGGACTCGGG GGCGOCCGOG CACTTGGAGA GCAAOGOCAC GGTGAGGGTG ACAGTGCTAG 2460 

AGGTOAATOA CftACQCGCXA GTGATOaTQC TCCCXaCGCT Q<2ftfiAAiC5GAC ACOGOGGAGC 2520

TGCAGGTGCC GCGCAAOGCT GGCdGGGCT ATCTGGTGAG CACTQmOBC GCCGTAOACa 2580

G05ACTT06G OGAGAfiOSOB CXSXCTCACCT ACGUGA-XGG7 OGAGGQCIUU: GACGAGCAOC 2640 

TGTTTGAOAT OQACCCGTCC AGCGGOQAGA TCCGCACGCT GCAGCCXXTC 3X3GGASGAOG 2700 

TGAOGCCOGT GGTGGAGCIG GTGGTGAflOG TGACOGACCZA OOGCAAGCCT ACOCTGTCCG 2760 

CAGXGGCCAA GCTCATCATC CSGCTCQOraA QCXJOATCCCT TCGCGASGGG GTACCACSeGG 2820
TQAATGCCGA GCAGCftOCAC TGGGACATGT GGCTGOGGCT CATOOTOACT CTGAOCBVCTA 2880

TCTCCATCAX GCTCCIAQQB GCCATBKTCA CCKTOGCCOT CAAC3!GCAA6 GGGGAGAACA 2940 

AGQAQATCCa? CACTTACAAC TGG0GGAT06 COG2U3TACA6 CCACCOGCAQ CTaOGTaQQa 3000 

GCAAGOGCAA GAAGAAGATUS ATCAACAAAA ATGATATCAT GCTGGXGCAG AGOGAAGTGG 3060 

AGQAOAGGAA OGCCATGAAC GTCATGAACG TQCTGAGCAG CCOCTCOCTG aGCACCTCX3C 3120

CCATG^TACTX OaACrAOCAG AGOOGCCTQC CKKTCAGCTC QCCCCGOTCX3 GAGGTGATGr 3180

ATCTCAAACC GGCClCCAAC AACCPGRCTG TOCCTCAlQGG GCftOQOGGGC TGOCftCACCA 3240 

GCnCACC3GG ACAAGGGACT AATGCAA6CQ AflAOCCCTg C CSbCTCGGATG TCCATAATXC 3300 

AGACAOACAA TTTTCC3C5GCA GftGCCCAATT ACKTCGGCAG CAGGCAGCAG TTTOTtOUWV 3360 

GTAXTTCAGrr ACCTCCAOGT TTAAGGACCC AGAAAQAGCC AQCCTGAGAe ACASTGGGCA 3420 

CJGGaaRCAOT GATCRGGCXG ACAGTSACCA AGACACTAAC AAAGGCTCCT GCTGTGACAr 3480

GTCTGTTAGG GAGGCACTCA AGATCSAAAAC TACTTCAACT AAAAGCC!!A2^ CACITGAACA 3540 

AOAACCAGAA GAGXGTGTTA ATTGCACAGA TGAAXGCCGA GTOCTTSOTC ATTCTOACAG 3600 

GTGCTGGATG CXJ^CAOTTCC CTGCAGCCAA TCSIGGCTGAA AATGCAGATT ACGGCACAAA 3660 

TCTCTTTGrA CCTACAGTTG AAGCTAATGT TGftGACTGAG ACTTACEAAA CTOTGAftTCC 3720 

CACTGGGAAA AflGACTTTTT GTACaMTlGG AAAAGACAAG OGAGAGCaCA CTATTCTCAT 3780

TGCC3UU:X3TT AAACX:TTATT TAAAAGOCaw. ACGTOOCCTO AOOCCTCTCX: TOCSUUSAGGT 3840 

CCCXITCAaCA TCAAGCAGCC CAACCAAGGC GXGGATOSAG CCTTGCACCT CAAGAAAAGG 3900 

CTCCCTGGAT GGCTGTQAAB CAAAACCA&& AGOCCTGGCT OAAGCAAGCA GTCAGTACTT 3960 

GCCCftCTGAC ACTCAATATC TGTCACCTAQ TAAGCAA<XA AGAGACCXTC CCTTCATGGC 4020 




TTCCQATCRG AXGGC&AGGG TCTTTGCAGA TGTGCftTTCC JW3AGCCA<5CC GGGATTCCAS 4080

TQAGATOGGT GCTGTTCTTG AGO^SCTTGA CCACCCCftAC AGGGATCTGG QCAGAGAGTC 4140 

TOTGeftTOCA GAGGAAGTTG TGAGAGAAAT TGATAAGCTT TTGCAAOACT GCOGGtSGAAA 4200 

CGACCCT6TG GCTGTGA6AA AGTGAAAAAA GAAAAAAAAA AAGGCATTGG CATTTTCTTG 4260

TCrCTTCTGT TGATTTAAAA ATCSATOCCTC CTGGTGATAA CCCATTTTAC AGGGATOAAG 4320

AAAGACCAAT GCTGCmftA OGCITTTA6T GAACATCtGA A(3TGCOCACA ACTATGrTCT 4380 

TTCCACTGCT 6ATTTCTTTT TCAGAQATAA C3W.TGOTTTC GTXTIQACCA AACTTQTATT 4440 

AGGACAGAAT TAATGATGCT TAAAGAGAAA AGAAAAAAAG AGAGAA6AAA AAGGAGAGAT 4500 

GAAAAAGGAG GATOAGGAGA AGAATTACCT TTTGACAATC TGTTAGGAAQ QTATGCAGTG 4560

TGAGAACTGA AGTATTTCTG ATCACTCTCA QACTOTCCnC CGTGATrrAT GCTGAIZTTAA 4620

CTQTTTACCr ATAAACCCCA TACAAAGCAG GGTCATAATT TGTaATCTGT GGTGSATTTC 4680 

TAGCAGTCAT CACAGGCTTC TACTOAAAGT CCTOAAAAGA OCTTGCAGTA GTCCAAGCTA 4740 

CACCAAACAT TAACACATAT TTGTGC3TAAA CATTTCTGTA TAAAGTTACC TGT^aW^ACAT 4800 

ATAAACACAA GGAAC3VITCC ATATrCATTAQ TOJAAAACAA AAACAAAAAA AAAACCITTG 4860

GTCATTTGTA AGACATCTCA TGTCATATAA AA6TTAAATQ TAAAAAGATA CAGTCCATTT 4920

TGTCCTGCSIC ACAOGTAGAC TAATTCACGT CAAAAAAAAA AAAAAA 4966 






Seq ID NO: C67 DNA Sequence

Nucleic Acid Accession #: NM_005601.2

Coding sequence: 101..598



Seq ID NO: C69 DNA Sequence

Nucleic Acid Accession #: NM_002985.2

Coding sequence: 69..344



1 11 21 31 41 51

I I I I I I

CCCAGGftGTC TGGGTGCACA GCCXCXTTTCX CTCTQAGATT CAAGAGTCTG ATCAGCAGCC 60

TCTTCCTCCT CCAGGACCCA QAAGCCCXGA GCTTATCCCC ATGGAGCTCT GCOHQTCCCT 120

GOCGCTGCIG GGGQGCTCCC TGGGCCTGAOr GTICTGCCTG ATEGCITTCA GCAOC6ATIT 180

CTGQTTTOAa GCTGTGGGTC CCACCCACTC AGCTCACTOG GGCCTCTGGC CAACAGGGCA 240 

TGGQGACATC ATATC3«33Cr ACATCCACGT GACSC3U3ACC TTCRGCATTA TGGCTGTTCT 300 

GTGQGCCCTQ BTGTCOGTGA GCXTCXnTQGT CCTQTDCTOC TTOCCCTCAC TGTTCCCCCC 360 

AGOCCAOGGC (XGCTTGTCT CAACCMOSC AGOCTTTGCT GCAGCCATCT CCATGGTGGT 420

GGGC3UX3GCS GTOTACAOCA GCGAGOGGlXS GGAOQ^BCCT CCACACOCCC AGATCCAGAC 480 

CTTCTTCrCC TGGTOCTTCT AOCTOGGCTa GGTCTCAGCr ATCCTCTTOC TCTGTAC3M3G 540 

TGOCCTGAGC CTGGGTGCTC ACTGTGGCQ6 TCCCCQTCCT GGCTATGAAA dCTTGrGAGC 600

AOAAGGCAAG AGCGGCAAGA TQAGTTTTGA 6C3GTTGTATT CCAAAGGCCT CMCTGGAGC 660 

CTOGGGAAAG TCTGGTCCTA CATTTGCCCG CCCTTGCASC CCTTCCCCAG CCCCTCCTCX 720 

TGTTTCTTCA TTCATTCAAC AAAATTTGC3C TGGAAAAAAA AAAAAAAAAA AAAAAAAAAA 780 

AAA 783

Seq ID NO: C68 DNA Sequence

Nucleic Acid Accession #: NM_006433.2
Coding sequence: 129..566

1 11 21 31 41 51

I I I I I I

GTATCTGTQG TAAACCCAGT GACACGGGC9G AGATGAQOrA CAAAAAGGGC AGGACCTGA8 60 

AAAGATTAAG CTGC^GGCTC CClGCCnVTA AAACAGGGTG TGAAAGGCAT CTCAGCGGCT 120 

GCCC2CACCAT QGCTACCTGG GCCXTTOCTGC TCCTTGCftGC CATGCTCCTO GGCAACCCAG 180 

QTCTGGTCrX CTCTOGTCTG ACOOCTOftGT ACTACGACCT GGCAAGAGCC CAC?CTGC56TG 240

ATGAGQAOAA ATOCEGCCCQ TGCCTGCSCCC AGGAGGGCCC OCaGGGTGAC CTGTTGACCA 300

AAACACAGSA GCTGQGCOGT GACTACAGGA CCXGICTOAC GATAGTCCAA AAAGTGAAGA 360 

AGAIGGTGSA TAAGCCCSkCC CnOAQAAOTG TTTCCRATGC TGCGACXiCGG GTQTQTAGGA 420 

OGGGGAGGTC ACGATflOOGC GAOGTCTGCA GAAATTMkT QAGGAGGTAT CAGXCTAGAG 480 

TTACCCAGGG CCTOGTGGCC GGAGAAACTQ CCX3iGCAGAT CTGTGAGGAC CtCAGGTTGT 540 

STATACCTTC TACAGGTGCC CTCTGAGOOC TCTCACCrTG TCCTQTQQAA GAAGCaCRGG 600 

CTOCTGTGCr dUSATOOCGG GAACCIGAGC AAGCTCTBCSC OGCICCTCOC TTOCTCGAXC 660

caOAATCCAC TCTCCABTCT COCTOOCCIG ACTCCCICTG CTQTCCTCCC CTC7CACGAG 720
AATAAAGTGT CABGCAAG 738



1 11 21 31 41 51

I I I I I I

GCTGCASAQG ATTCCTGCAO ASGATCaUKa CRQCAC3STGG ACCTCJQCACa GCCTCTCCCA 60 

CRGGTAOCAT GAA<3GTCTCC GGGGCAGOCC TOGCTOTCTWT CCJCATTGCT ACTGCOCTCT 120 

GCGCTCCTGC ATCTGCCTCC CCATATTCCT OQGACACCAC ACCClGCTGC TTTGCXZTACA 180

TTGOOOSDCC ACTGCCCCGT GGGCACATCA AGGAGTATTT CTACACCAGT GGCAAGTGCT 240

CCAAOXAGG A6TGGTCTTT GTCACOOBAA AGAAOSCSCCA AGrGTOTacx: AACCCAGAGA 300 

AGAAATGOOT TCGGOAOTAC ATCAACTCTT TOGACSATOAQ CTAGGA1X3GA GAGTCCTTGA 360 

ACCTGAACTT ACACAAATTT GCCTGTTTCT GCTXGCTCTT GTCCTAGCTT GGGAGGCTTC 420 

CX3CTCACTAT CCTAOOCCAC OCX3CTCCTTB AAOaaCCCRG ATTCTACCRC ACAGCRGCAG 480

TTACAAAAAC CITCC30CAGG CTOOACGTCSG TGGCTCACGC CTGTAATCCC AGCACTTTGQ 540

OAGGCCAAGG TGGGTGGATC ACTTGafiGlC AGGAGTTOGA GAOCftGOCIO GCCtUkCKIQA 600 

TGAAACCCCA TCTCTACTAA AAATACAAAA AATXAGOOSG GC39TG6TAGC GGGCGCCTGT 660 

AGTCCX»GCT ACTCGGGAGG CTGAGGCAG6 AGAATOGOGT GAACCOGGGA GGCGGAQCTT 720 

OCAGTGAGCC GAGATOGGGC CACTGCSWZrC CAGCCTGGGC GACAGAOOOA GACTCOSTCT 780 

CAAAAAAAAA AAAAAAAAAA AAAATACAAA AATTAGCCGQ GGGTGGTGGC CCACGOCIGT 840

AATOCXMGCT ACTCGGGAGQ CCAAGQCAGG AAAATTGTTT GAACCCAGGA GGlGGAGGCT 900 

GCAGTGAGCr QAGATTGTGC CACTTCACTC CAGCCTGGGT QACAAAGTGA GACTGOSXCA 960 
CSkACAACAAC AACAAAAAGC TICCCCMCT AAAGCCTAGA AGftGC lT CJ G AGGOGCIGCT 1020
TTGTCAAAAG GAAGTCTCTA GGTTCTGAGC TCIGGCTTTG CXnTGGCTTT GCCAG6GCTC 1080






TGTGACCAGQ AftfiGAAGTCft GCATGCCTCT AGAQGCAAGG- AGGGGAOGAA CACTGCACTC 1140 
TTAACSCTTOC GCCGTCTCAA CCCCTCACAG GA6CTTACTG OCAA;vCA,TGA AAAATOOGCT 1200
TAOCATEAAJV GTTCTCAATS AAAAIUU^ 1237

Seq ID NO: C70 DNA Sequence

Nucleic Acid Accession #: NM_022154.2

Coding sequence: 1381..1722

1 11 21 31 41 51

I I I I I I

AGTGTGCSTTT TAOTTTTTCC TAAGAJW3TGG OGTGGTTTGG GGCTTTATAT COGGG2«3GAG 60 

CATATOTACG CAAATOCTQC5 GGCGTTTGCA AlUZCCGGATC OGOG6CGTCT QGCC3CCATGC 120 

COGGCXX30GC QTTTGACtSGC TACTGCCACG CAGOGTTTCT tSGAGCCTGCC GGCTGGTGCC 180 

CX^GOTOacCT TTATCTCTGT OCCCCTTTGr CCTCTTTATC TQ^GGCXCTC CA^SGAGGCOG 240 

GGGGGOCCAC TCCGCCTATC GCTCCCCTOG GCTACGCTGC CACTCXnATG CCCOGCAGGT 300 

CGGOAOCTGC TGTTCTTTC6 AAGGCGOCGG AGAZ^CCAGGG GCGTCCCGC3G CCS^GCTCTGA 360 

CTOGGAGCftG CGCCGAQCAC THACGCTCOC GC5CCTTGGGC AAOOACOCXSV QTGCGCCXXaC 420 

GCGCGTOGCT CTGCGCGGCA GCCCG^OGGG GGCCCXCAAG GGGfkAGCOCA GGCCAGGATG 480

GCC0CX3GGTC GCQCOQTGGC C3GGGCTCCTG TTGCTGGOGG CX3GC0QGCCT CJGGAQGftGTG 540

GaaanGGGGC CAGGGCTAGC CITCAGOGAG GATGTGCEGA GCGTGTTOGG CGCGAATCTG 600

AG0CTGXCX3G OGGCGCAGCT CCAGCACTTQ CTGGAGGAGA 7GGQAGCOGC CTOCOGOGTG 660

QGajrCCCOG AGCCTGGCCA GCTGCACTTC AACCRGTGTT TAACTGCTGA AGAGATCTTT 720 

TOCCTTCAT6 GCTTTTCAAA TGCTACCCRA ATAACCAQCT CCflAATTCTC TOTCATCrGT 780 

CCnSCAGTCT TACAOCftATT GAACTTTCAC GCKTGTGAGG ATGGGOCCAA GCACAAAACA 840 

AQA0CAA6TC ATTCAGAA6T OTGGQSATAT GGATTCCTBT (ASTGAOGAT TATTAATC3X3 900 

GCATCTCTCC TGGGATTQAT TTTGACTCCA CTGAIAAAGA AATCTTATTT OCCAAAGKTT 960 

TTGaCCTTTT TTGTGGG6CT GGCTATTGQG ACTCTTTTTX CAAAIGCAAT TTTOCAACTT 1020 

ATTCCAGAGG CATTTGGATT TGATCCCA?^ GTOGAC5VQTT ATOTTGAJSAA GGC3iGTTaCT 1080 

GTGTTTGOTS GATTTTACCT AC7TTTCTTT TTTGAA2^GAA TGCTAAAGAT 6TTATTAAAG 1140 

ACATATGGTC AGAATGGTCA TACCCACTTT GOAAATBATA ACTTTGaTCC TCAAGAAAAA 1200

ACTCATCAAC CTAAAGCATT ACCTGOCATC AATGGT6TGA CMGCTATGC AAATCCTOCT 1260 

GTCACAGAAG CTAATGGACT^ TATCCATTTT GATAATGTCA TBTOTaOTATC TCTACAGGAT 1320 

GGAAAAAAAG AGCX^VAGTTC ATGTACCTGT TTGaVAGGGGC CCAAACTGTC AGAAATAGG6 1380

ACGAXTGOCr GGAIGJUTAAC GCICTGCQAT GGCXirCCAGA ATTTCATCGA IGaGCIOGGQ 1440 

ATtGGGGCTT CCTGCAOCTT GTCTCTOCTT CAGGGACTCA 6IACTTCCAT AGCAATOCXA 1500 

TGTQAGGAGT TTC0CCACX3A GTTAGGAGAC TTTQIGATGC TACTCAATGC AGaOATQAGC 1560 

ACTGGACAAG CCTTGCTATT CAACTTCCTT TCTGCATGTT CCTGCTATGT TGGGCTAGCT 1620 

TTTGGCATTT TCGTGGGCAA CaATTTCX5Crr CCRAATATTA TATTTQCAjCT TQOTaSAGQC 1680

ATGTTCCICT ATATTTCTCT GGCAGATATG TTTCX»GAGA TGAATGATAT 6CTGAGAGAA 1740

AAGQTAACTG GAAGAAAAAC OGATTTCACC TTCTTGATGA TTCAGAATGC TOQAATGTTA 1800

ACIGGATTCA CAGCCATTCT ACTCATTACC TTGTATGCAG GAGAAATCGA ATT6GAGTAA 1860 

TAQAAAAIGG AAGATGGTGX IGTTAATAAA GGCA.7TTAAT AOATAAAAAC ATCTCCAAAA 1920 

AGGATXTTGA AGCTGATCCT ATTTAGTTAA AAAGATAATT TTGCTTTCAA CTGTAGGTCC 1980 

AGAAAACTAA TTATTGGCAT CAGTCTGTQA AATAGTCCAT TATTTGTTOT T2yUUUkTaCT 2040 

TCaVftAAOGTT TTCAGrTGTCA GTCTGAGAXG CCIGGTATAT AGGAGCCTTI GGGAAATACT 2100 

TATTTTTCAG TATTCQUTGC ATATTAOATA TOUXATGAA <3CAAQA^^ACA TGCATTCTAT 2160 

AATCA3PGTAG A£3iCTCS«3AC TC3VGGGGAAA ATACAAGTTA TAT0CT6AAA GGCTTTAAAA 2220 

CTCTATGGTA GGATCAAAGA TTCAAATGOT TTCSGAGAGG TTTTATTTCA ATTAATTTGr 2280 

TCTAGIGCTT TCRAOAGCAA GTACATCAAA ATGTAGAAGG TAAAATGTAT GCAACACTAA 2340 

TTAXAAATTAT TCCRAGTCTT TAAGGMSOCA AAGAAAAAAA AfiZITTTCrCA CnSCXTTTTG 2400

TTCTGTTTTG TATTTCAATT ASGAACTTGC WSTATTATTT T6AAAACCAT TCTAAAATAA 2460 

TAGGAGTTM GAAATAAATA AAGTTTTGCT AfiCCCTGCTA AGTTCAGGCT TAGAGGCTTA 2520 

TOGCTAAGTW TAAACTTCAC CAOATTCCAC QAAAAeCTGO ATAGCTTTTT TTCTGACTTA 2580 

TGTTOXGGTT GCACCCCTCA CAAATGGCAG AACAGTATGT AAAGCIGGTA ACACCTOSGI 2640 

TTCABTSCAC CATGTGXITO CTTXaSOAAS CaTGAAGAA'XA TGTTGGTTTA GAGAAAGAAA 2700 

TTOSAITOSAA TTTTATGCTM TmCTSTTA AAGACAAAC& TaACTATTlA GCAGAGAATA 2760 

TTTTAATAAA TGCAAAACAA CAGCTGOACT GCTGTACAtC AAGOACAOAT TAACTGGAAA 2820

ACATATGITC CTTATOTaTG ATTGAGAGCC ATTCAGAAAA GACTTCCTTT GTGTTCAGCC 2880 

TATA'CTTTTC CATATQGTAT ACCTTGAAAA AAATTAfiCAC ACCATQGTTA TTTTTCTACC 2940 

TTTTATAAAA GACAGAGOCI aTTTACTCAT TTAeAAGATA GAGAAAATTG GTCTAAAATT 3000 

GAACATCXHA GATTCACACT OCCAAGTCAC TCAAG6TGAT TTGATGGltSA GGAAAATGAX 3060 

TQACAAAGCC CAACAAT6AT CTCAGOAATT ACATTTTCX2A AC^AGACCAAA AAATGTTTTC 3120 

ATGTAGGAGC AATGCRGATT TGGTGRATAT TTAATATATA TTTTAGrATG TATTTCACTT 3180
TATGAC1X3AC AATTAAAAAA TATTGITIGG CCAAATAGXA AACACCCTTT TOAAACCATG 3240
AAAAAA 3246

Seq ID NO: C71 DNA Sequence

Nucleic Acid Accession #: NM_004184.2

Coding sequence: 186..1603

1 11 21 31 41 51

I I I I I I

CGAAAAAAGA GGGGAAOAGT ATCAAAGACC ATTTCTGGCT GGGCAGQGCA CrCTCAGCSG 60 

CTCAACTGCC CAGC8TGA0C AGTGGCCACC TCTGCaUSTGT CTTCCACAAC CrGGTCTTGA 120 

CTOGTCTGCT GAACAAATCC ^TCTGACCXCA GGCCOGCTGT GAACBTAGTT CCTOAGAGAT 180

AGCAAACATG CCCAACAOTG AGOC3CGCATC TCTQCTQGAQ CTGTTCAACA GCATCGOCAC 240 

ACAAOGGGAG CTOGTAAGGT CCCTCAAAGC GGGAAATOCG TCAAAGGATG AAATTGATTC 300 

TGCftGTAAAG ATGTTGGIGX CATTAAAAAT GAGCTACAAA GCIGCCXSOGG GOGAGGATTA 360 

CAAGGCTGAC TGTCCTCCAG GGAAC0C3USC ACCTACCAGT AATCATGGCC C3USAT6CCAC 420 

AGAAGCTQAA GAGGATTTiG TGGAGCCATG GACAGIACA6 ACAAGCAGTG CAAAAGGCAT 480 

AGACTAOGAT AAGCTC2ATTG TTCQGTTTGG AAOXAGTAAA ATTGACAAAS AGCTAATAAA 540

OCQAATAGAG A£5AGOCACOG GOCAAAGAOC ACACCACTTC CTGCGCAGAQ GCATCTTCTT 600 






CXCACACaiSA GATATGAATC AGQTTCTTGA TGCCTATGAA AOTAAOAAGC CATTTXATCT 660 

GTACaCGQGC CGGGQCCXXT CTTCTGRAGC AA.TGCATOTA GGTCACCTCA TTCCATTTAT 720

TTTCACAAAG TQGCTCXy^GG ATGTATTTAA CGTGCXX:TTG GTCATCCAGA TGACGGATGA 780

CGAQAAQTAT CTGTOGAACSG ACCTGACCCr GGACCAGGCC TATGGCGATG CTGTTGAQAA 840

TGCCAAGGAC ATCATGGOCT GTOGCTTTQA CATCAilCAAG ACTTTCATAT TCTCT6ACCT 900 

GGACTAC31T6 GGGAnSAGCT CAGSTTTCTA CAAAAATGTC? GTGAA6AXTC AAAAGCftTQT 960 

TACCTTCA&C CAAGT6AAAG GCA.TTTTCX3G CTTCSUTTGAC AGOGACTGCA TTGGGAAGAT 1020 

CftOTTTTCCT GOCATCCA<?G CTGCTCCCTC CTTCaWSCAAC TCATTCCCAC AGATCTTCCG 1080

AGACAQGACE GATATCCAGT GCCTTATCCC ATGT<K:CATT GACCAGGATC CTTACTTTAQ 1140 

AATOACAAGG GACGTC2GCCC OCftGGATCXSG CTATCCIAAA CCASCXXTTGT TGCACTCCAC 1200 

CTTCrTCEXa G0CCTGCAO6 GC!GCCX3U3hC CAAAATGAGT GQCA6C6ACC CAAACTCCTC 1260 

CATCTTCCTC ACCrGACACOG CCAAtGCAfGAT CAAAACCAM3 GTCftATAAGC ATGOGTTTTC 1320 

TGGAQGGAGA GACACCRTCG AGOAOCACAG GCAGTTTGGG GGCAACTGTG ATGTGGACGT 1380

QTCITTCATG TACCTQACCT TCTTCCTOSA GGACmOSAC AAGCTCGAGC AGATCRQGAA 1440

GGATTACACC AGCGGAGCCA TGCTCACCQG TGAGCTCAAG AAGGCACTCA TAGAQOTTCr 1500

GCASCXXTTG ATCGCAOABC ACCAG6CC0G GCGCAAGGAQ <3TC3U3QGATG A6ATAGTGA& 1560 

AGAGTTCATS ACTCCCCGGA ASCrGTOCTT OGACTTTCRG TAGCACTCGT TTTACATATQ 1620 

CTTATRAAAG AAGTGATGTA TCAGTAATGT ATCAATAATC CCRQCCCAQT CAAAGCMCG 1680

CCACCTGIAG GCTTCTGTCT GATGGTAATT ACTGGGCCT6 GCCTCTGTAA GCCXGTGTAT 1740 

GTTATCAATA ClGTTTCCTC CT6TGAGTTC CAJTATTTCT ATCTCTTATG GGCAAAGCAT 1800 

TQTGGGTAAT TGGTGCTGGC ZAACATTGCl^ TGGTOGGATA GAGAAGTOCA GCTQTOaGTC 1860 

TCTCCCXMA GCAGCCXX31C AGTGGAGOCT TCJQGCTOGAA QTCCftTGGGC CACCCTGTTC 1920 

TTGTCCATGG AGGACTTCOG AGGGTTCCAA QTATACTCTT AAGACCCACT CTGTTTAAAA 1980 

^MATATTC TATGTATGCG TATATGGAAT TGAAATGTCA TXATTGTAAC CTAGAAAGT6 2040 

CTTTGAAATA TTOAT<?rGOG GAGGITTATT QAQCACAAGA TGTATTTCA6 CCCATGCCCC 2100 

CIGGC3UIAAA GAAATTGKTA ACZrAAAAOCT TCGTSKEACA TZlGACXAAe AAATCAOCCA 2160 

GCTTTAAAGC TGCTTTTAAC AATGAAGATT GaACAiOnOTT CAGC3VATTTT GATTAAATTA 2220 

AGACTTGGGG OTGAAACTTT CCRGTTTACT GAACTCCAGA OCATGCAIGT AGTCCRCTCC 2280

AGAAATCATG CTOGCTTCCC TTGGCACACC AGTGTTCTCC TQCCAAATGA OCCTAGAOCC 2340 

TCIGTOCTGC AGAOTCAGGG TGGCTTTTCC CCTGACTGTG TCOGATGCCA AGGAGTCCTG 2400 

GOCTCCGCAG AIGCTXaiXT TTOACCCTTG OCTGCAGTGG AAGXCAOGAC AOAGCAGTGC 2460

CCTGGCTCTG TCCTGOACGG GTGGACTTAG CTAGGGJUaAA AGTOGAGGCA GCAGOCCTCG 2520 

AGQCCCTCAC AGAIGXCXAQ GCAGGCCTCA TTTCATCACQ CAGGATGTaC AGGCCTGGAA 2580 

GAGCAAAGCC AAATCTCAGG GAAGTCCTTG OTTGATGTAT CTGGGTCTCC TCTGGAGCAC 2640 

TCTGCCCTCC TGTCACCCAG TAGAGTAAAT AAACTTCCTT QQCTCCTAAA AAA 2693

Seq ID NO: C72 DNA Sequence

Nucleic Acid Accession #: NM_004938.1

Coding sequence: 337..4632



1 11 21 31 41 51

I I I I I I

CGGAGQACAG CC3GGACOGAQ CCAAGGCCGG GGACTTTGTT CCXITCCRCOS AGGGGACTCG 60 

GCAACTCGCA GCOGCAGGGT CTGGCaSCCGG CGCXTTGGGAG GGATCTGOGC OCCCX2ACTCA 120 

CTCCCTRGCT GTGTTCXCTC OGCOGCOCOG GCTAGTCTCC GOCaCTGGOG OCTATGGTGG 180

QCCTCOGACA GGGCTOOGGA OOGACCQGGG GAGCTCGGAG GGGCCCQGGA. CTOOAQACnS 240 

A^TGCATQABG GGOCTAGOtaA GQOGCAGGAG GGSIGGIGAT aOTCTGGGAA GCGGAGCIGA 300 

AOTCCCCTGG GCTTTGGTGA GGOGlGACRQ TTTATCATGA COGTGTTCAG GCAGGAAAAC 360 

GTGGATGATT ACTAOGACAC CGGOGAGGAA CTTGGCAfiTG GACRaTTTGC GGTTGTGAAG 420 

AAATGCCGTG AGAAAA^AC 06GCCTCGA6 TATOCCaCCA AATTCATCAA GAAAAGGAG6 480

ACTAAGTQCA OC0GGGGC3GG TGrOAOCCGC QVOGACATOe AfiCGGGnGOT CAOCATCCTO 540

AAOOAGATCX: A6CACCCCAA TGTCATCAGC CTGCAOGAiGG TCTT^TGAGAA CAAGACX3GAC 600

GTCATCCTGa TCTTGGAACT GGTTGCAGOT GGOSAGCTGT TTGACTTCTT AQCrOAAAAO 660 

GAATCTTTAA CTGAAGAOOA AGCAACTGAA TTTCTCAAAC AAATTCTTAA TG6TGTTTAC 720 

TACCTGCACT CCCTTO^T OG<XX:ACnT GATCTTAAGC CTGAGAAdCAT AATGCTITTQ 780 

GATASAAftga TCXXXIAAAOC TCGGATCAAG ATCRITGACT TiaGGTTGQC CCATAAAATT 840 

GACTTTGGAA ATtQAATTTAA AAACATATTT G6GACTCCAG AGTTXtSrOSC XCClGAfSATA 900 

GTCAACTATQ AACCTCTTQG TCTTGAGGCA GATATGTBOA QTATCQOOGT AATAACCTAT 960 

ATCCTOCTAA GTGGQGCXTTC CCCATTTCIT GQAOACACTA AGCAAGAAAC GTTAGCAAAT 1020 

GTATCCGCTG TCAACTAOGA ATTTQAGOAT GAATACTXCA GTAATACCM TQCCXmOCC 1080 

AAAHATTTCA TAAGAAOACT TCTC3GTCAA6 GATCXAAAOA ASAQAATQAC AATTCAAiGAT 1140 

AGTTT6CAGC ATCXCTGGAT CAAGCXSTAAA GATACACAAC AGGCACTTAG TAGAAAAGCA 1200 

TCAGCAQTAA ACATGSAGAA ATTCAAGAAG TTTQCA6COC GGAAAAAATG GAAACAATC3C 1260 

OTTCieCTTGA TATCACTQT6 CCAAAGATTA TCCROQTCAT TCCTGTCCAG AAGTA/VCATG 1320 

AGTGTIGOCA OAAGOGATGA TACTCTGGAT GAGGAAGACT OCTTTGTGAT GAAAGCCATC 1380 

ATCXATGCXyi TCAACXSATQA CAATGTOGCA G6CCTGCAGC ACCTTCTGQG CTCATTATCC 1440 

AACTATGAT6 TTAACX3VAOC CAACftflGCflC QGQACAOCTC CRTTACTCAT IGCTGCTGBC 1500 

TGTGeOAATA TTCAAAlACT ACAGTTGCTC ATTAAAAGAG GCTCX3AGAAT CX3ATGTCCAG 1560

GATAAGGGOG GGTCCAATGC OGTCTACXGQ GCTQCTCSBGC ATGGCCAOGT OGATAOCTTG 1620 

AAATTTCTCA GTGAGAACAA ATGCCCTTTG GATGTGAAAQ ACAAGTCTGO AQAGATGGCC 1680 

CrCCACGTGG CaGCTCQCTA TQGCCKIGtTC GAOGTGGCTC AAGTTftCITG TGCAGCnCG 1740 

GCTCAAATCC CAATATCCM OACAAAiGGAA GAAGAAACCC CGCTGCACI6 TGCTGCHTOG 1800 

CACGGCTATT ACTCTGTQGC CAAAGOCCTT TGTGAAQCOG GCTGTAAOGT GAACATCAAG 1860

AACCGAGAAG GAGAGACX3CC OCTCCTGACA GCCTCTGCCA QQGGCTAOCA OGACATCOTQ 1920

GAGTGTCTGG CCOAACATGG AGOCQACCTT AATGCTTGCG ACAAGOACGQ ACACATTGGC 1980

CTTCATCIGG CTGTAAC3ACQ GTGTCA^TG GnSGTAATCA. ASACTCTOCT CAGCCAA0GG 2040

TGTTTCGTCG ATTATCRAGA CAGGCACGGC AATACTCCCC TCCATOIOGC ATOTAASUaAT 2100 

GQCAACATGC CTATCGlGGT GGCCCTCTGT GAAGCAAACT GCAATTTGGA CATCTCCAAC 2160 

AAGTATGQGC GAAOGCCTCT GCACCTTGCG QCCAACAACG GRATCCTAGA CGTGGTOCXSG 2220 

TATCTCTGTC TGATGGGAGC CAGOGTTGAa GOGCTQACCA CGGACGGAAA GAOOGCAGAA 2280 

GATCTTGCTA QATCGGAACA GCAOGAGCAC GTAGCAGGTC TCCTTOCAAG ACTTOGAAAG 2340 

GATAOC3CACC GAGOACrCTT CATCCAGCAG CTCCGACCca CACAGAACCT GCAGCCAAiGA 2400 

ATTAAGCTCA ASCTGTTTGG CCACTOGGGA TCCXSGGAAAA CCACCCTTGT AGAATCTCTC 2460 

AAGTOTGGGC TGCIQAGGAG CTlTTTCAGA AGGCSTOGGC GCAGACTGTC TTCCAOCAAC 2520
TCCAtGCAGGT
AROCTOTACC 
GGTCTTACCA 

GATTICAGCX5 
GCTGCAAATG 

GAACCCATAa 
CACGCTGJWJA 
ACATCGXTQC 
CTGTTTGTTC 
CT6CAAGAAA. 
AAAA.TCATCT 
CTGCAGCAGT 
CTCAGGOGCA 
ACAGTTCAGG 

GACATCCAGC 
ATGGACATCT 
AAGACAGAC& 

6TCCAG6TGA 

GTCAACa«X5 
TGCT6CCTQC 
CCM3GGCTCC 
CCOSTOVTGA 
CTQAOCAACA 
CACGACGTCr 
CTCACTGGGA 



TACCCTGAtSA 
GCC3GCAQACC 
CAGGA/QGCCT 
GTAl^OCCaOT 
OGQTQA7CTA 
GCCi^jSGGGGA 
CCOTCTCATT 
CmOGGOBC 
CTCCAGXAOC 
TTTTTTAACT 
TQTAT^CTT 
6T2UGGAGAAA 
GTGOCTTTTG 



TQAOTTTTGT 
A6AAGG&AAT 
TTTCOGTTTG 
ATGCTTCATC 
ACCXATGTAA 
TGCATASCAG 
TAAAAGCAGA 
TAAATTGATA 
6TXATATCAT 
TTTTTACTTA 
TAAACTBTTa 



TCOCACCTTC 
CAGGCrrGCGA 
AAGQGATGCT 
AQTCCACCZAA 
TG1X3GGAGTT 
ATCCCAOQTC 
ACCCAGTGAT 

cxrrrcGGTGG 

TCATGAATGT 
TGAAAGAGAT 
TGGATGCTGQ 
TAOGAAGCCA 

CCaCGCTGCC 

ttgtgtaoga 
ttgctcrgca 

ACQTGCTGCT 
TOGAQACCCC 
GCCTCSGTGCC 
GOGCCCGGQA 
ACCTGCAC06 
TOGTGCCCQT 
ACCTQTGCOG 
TGAATGGCTG 
GCCAGGGCAT 
TQCTGGACTC 
T6A0GGTC3AA 
TCTACCAGOC 
CCATGGGGG6 
ACTCACAGGC 
OSAAACTGAG 
GCATGAACTT 
AGQATTTCXnr 
GCACAQTGGG 
TTTTGCTGAA 
AT<5GCICQAG 
GM96GCAGCC 
GOCCATCCTT 
TQCCACTCCT 
COGTTGTCTG 
TTGAAAAGCT 



ATCATATTGA 
ATACGCTTTT 
CTCXXaiGTA 
CA0AA6AGOG 
ACCTCirCTtST 
TTIGTATTGT 
CCTCCTCCAfi 

Cttttgttcc 

ATTCCCTCTC 
GAftTTQ\aaA 
ATOQTTTCCA 
AAATAAAGGA 
GGAGQGAACA 
ATBTATATAT 
ATCITATAAAA 
CTOGTTAAAA 



ACCCCTGGCr 
GAAOGTXSAGT 
GGAGGTGTTT 

ggocatcgac 
ctczggaaat 
aatccatgtt 

TTTCTCSGCTC 
CAAGCTGRAG 
TCCTCGACXSJ 
TAGGAACAGG 
GGCTTCTGGG 
GATTGTTTC36 
TTCCTG6AGG 
CGTGCAGGAC 
GCTCCACAGC 
GCXOGAOCCC 
ACGGGCGCTG 
OGACAGCGAC 
CCTOAGCAGC 

GGAACACCTC 
GTGGATCCAC 
CAAGCTGGCC 
TGAGGTCCAG 
GGTGTQOUSC 
GCATTACCIG 
ACOGGACTTC 
GTACAAGGAA 
CAGCXrrcGGC 
TCGCCTGCTG 
AGGCCTGCCT 
CCOCAGCCOC 
CAOCCTCATG 
GGCRTCCTCT 
CTGCAACAGC 
TCTGGCTTGO 
CCCTTTGGAG 
CCCTOCGGCT 
TGGATGGTCA 
AGTOTACXrrC 
CTGATAATTT 
TTGTCCTTTA 
CTGTGTTATA 
IGGAATCCCA 
TGTGTTTGAA 
CCCTTQCTGT 
TTTCTGACAG 
GOTGATTTTA 
AATGTCRATG 
ATCTCAGOTA 
GACCCCTGAC 
CAXTTAGATC 
AOCAAGTTTT 
TGTCC?yAATT 
ATATATGCAC 
TGTC6TTAAA 
AAAAAAAAAA 



TCTAAGCCCA 
GTGAGGAGCX: 
GTGGCCCC3QA 
ATCCAGAAC6 
CCTOTQTATT 



AACCCACTCC 
GCTGGAGGOG 
TTTQGAAATG 
TCAAAGGACA 
GTCTGTCCTC 
AA6CTCAATG 
CAGCTOAACC 
ACAGG06AGA 

CACCACTACC 
GT06AGGAGC 
GGC3AOCATGG 
GATGAGGAGQ 
ACCCCCTTOC 

AACOGTGGGG 
GTCCGTGGCC 
ACX^TTGAGA 
AGGCCOCAGC 
TTGCQQGCAC 
AGCITCAGCA 
ATGGACATCC 
GACOCGOCX^ 
GACCTOGTGG 
CTCCAGGOCC 
TCCAAACTGA 
OTC3TTCAAAA 
GGCACCTCri 
AOUSGGTCTG 
ATGCTGAGG6 
TQACCTGTTT 
TTGCftSTTTA 
CTCTCAGTOT 
TGCEGGAATT 

aaaaagaaaa 

ccatttcctc 
ctotatgatt 

ATCATCOGAa 
ATGCTGATCA 
TTTTTdJGTT 
TGATCAGTGT 
TI3AAGGICCA 
<?AA<3GTTGAC 
TCATCATTTG 
CTQGTTTCAT 
CTTOCATGAT 
CTTCTGTCCT 
TAT6TATATA 
AAiGTTGTTlG 



CAGTCTCAGT 
GCA6CATGAT 
CCCACCACCC 
CTTATTXGAA 
TCTGCT6TTA 
GTCTAGAAGA 
AOTCCCTTOT 
AAQTTGTCCT 
AGTTTGGATA 
ATCTTCACAT 
TGAAG6TACI 
CCATGACTCA 

GACXxauwrCA 

CX3CTGGCCAG 
TCAACATCAT 
OCACAAAOGT 
GGGGOCSCTA 
TQCTGCAQAT 
TGGAC6TCCC 
ACX5AGGTGAT 
CATSTGGCAT 
CAGAGGC306A 
CCGAGCTQCT 
TGGAOAOSGIA 
ACGTCATGGC 



QAGCATCAAC 
GTTOGAGCCG 
GCACTGCTCQ 
TGGAGTTGGC 
TGACTATTTT 
GCCCTATGAG 
CCCAGTTGAA 
GGTGGCCACC 
TQACAAAOAC 
TTCAAATAAG 
TGQAAATCAT 
CCIGTGT6AS 
GCTOATQTCG 
CGAOGAGGAC 
GCAAAGTGAA 
CCrrGGGGAAG 
CACCGTGGAG 
CCTCGATGCX: 
AGCCCTQATC 
GGTGTATGGT 
CTTTCACAAG 
C<30GGACATC 
GGroCTGCTG 
GAAGATCAA6 
CACCACGCTG 



AI3ACTC7GAA 
6CATCATGTG 
ATGCATCAOA 
RCCCCCTQGG 
CAAAGTAOMl 
TGCTGGGGGA 
OGOAGCTGOG 
TCAACCTGGA 
AlCAATTCCSUr 
TTTGGACTGC 

CTCTGCCX3CT 
AOAf^CAGAAC 
TTTGGACTCC 
CCTAACTTTT 
GT6CATATTT 
AGCTTATCTC 
TATAAACA^ 
TCAGCCAGGA 
TOGCCftSAGG 
-x-A-u'i-x'TGOCA 
TGTTGCTCTA 
CATGAAACCT 
ACAGTTGTA6 
TGGCAGTCCC 
AACTTOCTGT 
TTTAAATTQT 
QAGAAGCRTG 
CATATATATT 



GGAAACCTCA 
CrXQGGQTQT 
CCTGAACCTC 
GAAGGACTGG 
CACCAATAAC 
ATGGACCACC 
TOGCCGGGAT 
XGGC3^TGac 
TAGCTCTGTI 
A6AACCAAGS 
TOCAGCCACA 
ACCTCCCTCC 
AQATCmTA 
ATCrCrCATC 
CAATGACATT 
ATCCAAAZVTG 
TTTTATATTT 
CAATATGTGA 
GCTGTCACCA 
TGCTTCACCC 
AGGAAAGGG6 
OGAAGACATT 
ACACACTGTC 
GGTTACAGA8 
TTATAA'^FQG 
ACTTGAAGTC 
QATCGA03TT 
TAATGTTAAT 
AATACTOGTA 
TTTTTATAAA 



Seq ID NO: C73 DNA Sequence

Nucleic Acid Accession #: NM_002081.1

Coding sequence: 222..LS5d



1 
1 

GGCTGCCOGA 

GGCTTTTerr 

OQQQACCTTQ 
AGAGGOGCX3G 
GCTGGTGQCT 
GCAAGAGCOG 



CdGCXGCAC 
Ca30GCTGC3G 
TOGATGACCA 
COGGQGCCTT 
AGCTQGGCCT 



ACTACCTGGA 
GAGAGCTGCG 
TGGGOSTGac 
a3AGAGCTGT 
CCICCCCIOA 
AGGCCGAGTG 
CATaSGGTGT 



11 
I 

GOGAGCQTTC 
GICTCCQCCT 
GCTCTGOCCT 
GGGGGTGGGC 
GCTATGTQCG 
GAQCTGOSac 
CCnSOCGGAG 
CAGCGAGATG 
OGAGAGCAGC 
CTTCCAGCnC 
CGGAOAGCrG 
6TACTAC06C 
aCTGGnOGQC 
CTGCCTCGGC 
CCTGCGGGCC 
CAGCQAC6TG 
CATOAAGCTG 
CTATTGGCGA 
GAOSAACCTC 
OGAGAGTC3TC 



2X 
I 

G6ACCT0GCA 

GCTcaacGoc 

TOGOSGGOGG 
GGGGGCaCCG 
GCCGCAGOGC 

ATCTCSGGGTG 
QAOGAGAACC 

OGOGTCCTGC 

cracTGAAoa 

TACAOGCAGA 
GGTGCCAACC 
CTCTrCAAGC 
AAGCAGGCCQ 
ACOCGTGCCT 
GTGCGQAAAG 
QTCTACXGTG 
AATGTGCTCA 
CTOGACTCCA 
ATC3GGCRG0G 



31 
1 

CC0CGCX3CGC 
CGCCQCCTCT 
GAACTGOSCA 
CCGGCCCCDC 
TGGTCGCCIG 
AGATCTA06G 
AGCAOCXGCa 
TQGCCAACOG 
AGGCCATGCT 
ACTCQGAGCG 
AOSCGAGGGC 
TGCACCZGOA 
AGCTGCAOCG 
AOGOGCTGCXa 
TCGTCGCTGC 
TGGCrCAGGT 



AGGGCTGOCX 
TG6TGCZCAT 
TGCACAGOTG 



41 

I 

GOCaOGCOQC 
GGACOGGGAG 
G6ACCCGGCC 
CATGGAGCTC 
CGCXXX3CaQQ 
AGOCAAGGGC 
GKXCIGTCGC 
CAGGCATGOC 
TGCCACCCAIQ 
GACGCTGCAG 
CTTCGGOGAC 
GGRGACGCTQ 
CCAGCTQCTS 
GCCCTTCGGG 
TOSCTCCTTT 
CCCGCTGGGC 
GG6AGTCCCC 
TGCCAAOCAG 
CAOGG3UCAAG 
GCTQGCXSGAG 



GGG06C0GCC 



A0GATCGG9U} 
OGGGCCCOAQ 
GACCCGGCCA 

arrcAocxTrQA 

CAGGGCIACA 
GAGCTGOAGA 
CTGCGCAGCT 
GCCACCITCC 
CTGTACTCAdS 
GCOGAGTTCT 
CTGOCTCSATG 
GAGGCCCOGA 
GTaCAGGGCC 
COGGAGTGCI 
GGC3GCG)fiGC 
G006ACCTG6 
TICTGGGGTA 
GCXaXCSkACQ 



2580
2640
2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420
3480
3540
3600
3660
3720
3780
3840
3900
3960
4020
4080
4140
4200
4260
4320
4380
4440
4500
4560
4620
4680
4740
4800
4860
4920
4980
5040
5100
5160
5220
5280
5340
5400
5460
5520
5580
5640
5700
5760
5820
5880
59X0 



60 

120 

180 

240 

300 

360

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 






OCCTCCAGGA. 
AGOTCAACCC 
GGGAGAGGCC 
GCGACGTCCA 
TQRGCACTGC 
A06TCATGG6 
CCAAGCCGGA 
TGCXSCAGCGC 
QCTOGQGCAG 
GCICCASCTC 

TCCTGGCCCT 
GAGGGCAAGG 
TGGAOAIGGOC 
GTOOCAGCCC 
CrUBQZCAiQCT 
TCOGGCTGCC 
CTACAOAGOA. 
CGCCTCCTCC 
CCTCCS^GAGA. 
TCTG3UGATG2^ 



GGAGTCTGAG 
GGOGCXACTG 
GOAGGCAGOG 
GOCTTGCIGG 
CCTGCTC!CCA 
CAGGGCTCAG 
CCCCGCAJCTG 
GCACOGGGAC 
TGXCCTTGIT 
CaCCTTGGAC 
CTGGACGQGC 
TGTGGTGTTG 
XCCTGAACOG 
CGCftCAGTGS 
CTGACTTTAG 

ccx:tgccagt 

CAGCACTCCC 
TCZCTGGAAG 

cxmrccTCCA 

TGCTTOTATG 



ChACAGGGAC 
CCAGC3GCCCT 
ACCTTCAGGC 
GGACTTCTGG 
CAGTGATGMC 
TGIUZGGCCTG 
CATGACCATC 
CTACAACGGC 
OGGTGAnGGC 
CC3GGA0GCCC 
GGCTGCCAGC: 
TACfcGTAGCC 
ACTGACTTTG 
TGGGGTOGGA 
CAOGCCTOGC 
GGOAOCCAGT 
TAGCCCTCOC 
GGC5CTCAAAG 
CACrGGGACr 
AGCCCCGCAC 
TGCATGATGC 
QAGGGGCCCC 
GACTGTCCTC 
ACCCACCTOC 
TGGGCTCTGC 
GGTCC31GGGC 
TCCTCACCCA 
AGTGACCCXC 
CACACOGGAA 
CTOGATAGTT 
CATQGAOAQC 
CCTGGTGACC 
CCTCCTTOCC 
GGAAGGGGTC 
ACIGACCCTQ 
AOGGAGGTCC 
ATOTTTTOGQ 
GCCAGGGTGG 
GCTGCAjCACA 
GGGCAGCCCT 
CAAGGTCCCC 
AATAAAASGC 



ACGCTCACGG 
GGGCCTGAGG 
ACGCTGGAGA 
ATCAGOCrCC 
GGCTGCrGGA 
GCCAACCRGA 
CGGCAGCA6A 
AAOGACGTGQ 
TGTCTGGATG 
TTGACCCATG 
TGCCCCCAGC 
AGGCOCX3GGT 
CCAAAAATAC 
CAGGGAGGGC 
CTCGCCTGCC 
QTGOCCAAAA 
CCCAfiCTCCC 
C3«W:CCGCTG 
CCCAGCAGAG 
GGGCTGTCTG 
CCTCCCCTCSv 
AGCGTCTGCA 
CCACAGACCC 
□CTTCraCTB 
CAATGTGGGC 
TGTTOaAGQA 
GATCAGGAAC 
GGCIGTCACC 
TOCOTAGQTC 
AAGGGCTTTT 

TCCTGTCACT 
TCCTGTGCCC 
CTGGAGGGGG 
AGQAGGCGQC 

ATCAOGAOCC 
GCTGGGGACT 
GACOOCCTAG 
6A6TGGTCAC 



CCAAGGTCAT 
AGAAGCGGCG 

CA0G6ACACT 
AOGGSATGGC 
TCAAiCAACCC 
TCATGCAGCT 
ACTTCCAGOA 
ACCTCTGCGG 
OCCTCCCAGG 
CCCCGACCTT 
GGCGGTAACT 
AACACAGACG 
CGGCGGCTCT 
TTTCTGCCTT 
GCCATOTATT 
TGCACCX3C3CG 
GLftGCCCACftS 
OCCACCAGOC 
GGTGTCOGC3C 
GCGCAGGCTG 
GGGTGACGCC 
TGCAGTGAGG 
OAOOAGGGGA. 
TGOCCCTOGC 
CCCCGRGGGC 
CAOGGCCTCC 
TGCTCACAGG 
CCTTCCC33AC 
CCAAACATGC 
TC2GCASATGG 
CACTGAGGCC 
CAGcrac{;:AQ 

AGGAGGACTT 
TEAGTGCTGC 
GTCAGGTCGC 
CCCAAiCACAG 
CTGGCACAGT 
QQSTQGCGCT 
TGGTCAGQGC 



CGAGGGCTGC 
CDGGGGCAAG 
TGAAGCX3VAG 
GTGCAGTGAG 
CAGAGGOCGG 
CGAGGTGGAG 
GAAGATCATG 
C3GCCAGTGAC 
CCGGAAG6TC 
CGTGTCAGAG 
CCTCCTGCOC 
GCCCCAA06C 
ATATTTAATT 
GAQCAIQGGGC 
TTAATTTTGr 
TCAGGOACCT 
GAGAAGCAGC 
CX3A0CCTGTG 
AGCCCTGGCC 
ATCCAGQGTC 
CAGAG0C0G6 

tgagacagca 
ggcxx:tocat 
agctgggccc 

ACACAOGGCT 
rrG»GGAlSCA6 
CTGTTCACGG 
GATGCTGGT6 
CCMCCAGCT 
ATGCATTTAC 
CTTOGGAGGC 
ATCAGGGCCC 
GTGGCCCTGG 
QGftGGGTCTG 
TTTGCTITTC 
GATGGCTTGT 
GCAASTCCAC 
GATGCOGGGC 
CRGACCCCaC 
AGTGGCCAAO 
GTGA£3GTGTG 



GGGAAGCCCA 



GCCCAGCTCC 
AAGATGGOOC 
TACCICCCGG 
GTGGACATCA 
ACCAACGQGC 
GACGGCAGCG 
AGC3«3GAAGA 
CAGGAAGGAC 
CTCCTCCTCT 
CCCAGGGACA 
CACCTCAGCC 
JV5GCGGAGAG 
ATGAGGTGCT 

C3U3GGGau:c 

OCCTC3GAGGC 
OCTTCCrCCC 
CACCCCCCAG 
TGOCAGAOCC 
CCCX3VCCTCC 
CCACTGCTGA 
GCGCAGAT6A 
AAAGGCCCAG 
CACAGGGCAG 
GCASGACCCX3 
TQACACAnaT 
GCTGGTGAGA 
GCACTGCAGG 
TGACACTTCC 
C0GCAGG6CC 

rrocxxxswsGc 

GGAGGGGTGS 
GGGGCAGCTG 
ATCACOGTCC 
TCrCTCGAAC 
CGCATAAIAA 
GCCftGGACAG 
CCTACGCTCA 

TTCTTTTGAG 



TQ3AAACCIA M 



Seq ID NO: C74 DNA Sequence

Nucleic Acid Accession #: BC030205.1

Coding sequence: 45..878



1 
I 

GTGAGCAGCC 
ACTTATTTCT 
ATAACTCCAT 
AATACAAGCT 
CAACTTACAA 
GGATOGCXAA. 
QAAAAACTGS 
ATTGCIACaA 
TTAAATCTCC 
GACTCAA6XA 
CftJGGTTGCrT 
TGGGAAGATA. 
CC'TTGAAGTT 
CAATGGATCC 
ATAAAAACTX 
ACACSKZniG/IQ 
ATTTATTTAT 
ATAGGAAACT 
GOGTTAACAT 
ATOTAOCTAT 
TTATACTTTT 
AAAICATGATT 
ASOCT GTCTC 
CTTBAACACA 



11 
I 

CCTAAGAfSGC 
CTTGCTATG6 
ATGGCTTGAA 
CACCTACGCA 
GCAGCTAGAfi 
GGGCAlGASTT 
CATTATTGAT 
CCCACAC3GCA 
AGGCTTCCCA 
tTGGTCAGCXSI 
GGCH^ATTAT 
CTGTGGeAGAT 
TCTAAOTGAT 
TGTATC5CAAA 
ITTAGCTOGA 
TTTATGTTGG 
TATTTTTCT& 
TTAAACGAGA 
TttCATATTT 
ATGTATTTGC 
TAAAXCITGA 
TTAAACA23CT 
TATTGTTGGA 
AAAAAAAAAA 



21 
I 



31 
1 



GAASACACTC 
CXSAGCAGCCe 
GAAGCTAAGG 
GCAGCCAGAA 
GGATKCGCXA. 
TAIGGAATCC 
AAGGAGTGTG 
AATGAGTAOG 



AAGGAIGGGG 
OTGTGTACXa. 
CXSCTTGTGTGA 
AAATTGGATI 



GrTGAAATAT 
GAGCTTCCnS 
GCTTCAflTGA 
TCCAXvTCAAG 
AOATTTAGCC 
AATCTTTTGG 
AATQTGAAAG 
AAATGAAAOC 

ATITGAAATT 
ACTTTATAAA 
GCAAAATATT 
ATTTCAGGTC 
i\AAAAAAAAA 



GTGGCGTCTT 
AAGATAAOCA 
GTTTTTTAGA 
ATGACAGnA 
ATGACATCAT 
CAGCTGOAGG 
GAAAAAATAC 
ACTTATAAAA 
AACTOCTTTQ 
CAA^ACATAA 
TCrCATAATC 
<yiCATTTTTC 
TTGGAATOCX 
CA'TTTTCTGA 



ATITTCATAA 
AAAAAAAAAA 



41 

I 

OQATATGATC 
A'x-j.-v»AGGAT 
GAGAGA7U3CA 
ATTTGAAGGC 
TCATGTCTGT 
AGiGGGCQAC 
GAGTGAAAGA 
TACAGATCCA 
AATCTGCIAC 
TTTTOACCTT 

csATGtzaerc 
CAGTAcnsaA 

TTXCCAAATC 
AAOTACTACT 
AAAAAAAAAG 
ATCSCACV6T 
TTTAGGGAAA 
CCACTGCATA 
TATTTGTQGT 
QCTCTA'TGTA 
AAT CaygCA T 
AATGTTTIAr 
ATA'iTxvxxidC 
AAAAAAAAAA 



12 £0 
1320 
1380 
144D 
1500 
1560 
1620 
16B0 
1740 
1800 
1860 
192 D 
1980 
2040 
2100 
21«D 
2220 
22D0 
2340 
2400 
2460 
2520 
2580 
2€40 
3-700 
2760 
2B20 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3692 



51 
I 

ATCIXAATTT 
GQAATTTTTC 
O0GTCTG6CA 
GGCCATCTOQ 
GCTGCTGGAT 
VGXGSMsrXG 
TGGGKIGCCT 
AAGCS^TTT 
TOOCACATTA 
GAAGA TGAOC 
CATSSCmG 
AATGTCATGA 
AAATATGOTTG 
TCTACTOGAA 
GASnSATCAAA 
XATISITTAAC 
ATTGGAAAAT 
GAAATAACa^ 
ATATGTATAT 
CAGTTTTGTA 



GCSkTTATTTA 
AATAAATATC 



Seq ID NO: C75 DNA Sequence

Nucleic Acid Accession #: NM_001982.1

Coding sequence: 199..4227

1 11 21 31 41 51

| | | | | |

cTcrau::ACA cacacacccc tcccctgcca tcoctoocog gactccggct ccggctccga 

TTGCAATTTG CAAOCTCCGC TGCOGTCGCC GCM3GAGCCA OCAATTGBCC ACCGGTTCAG 




60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1060 

1140 

1200 

1260 

1320 

1380 

1430 



60 
120 






GTGGCTCTTO CCPCGATGTC CTAGCCTAGG GQCGOCGCSBG CCQGACTTGG CTGGGCTCCC IfiO 

TTC!ACCCTCT GOQGAGTCAT GAiGGGGGAAC GAOCSCrCTGC AOSlCCTGGG CTTGOTTTTC 240 

AGCCTGGCCC GGGGCTCCGA GOrGGGCAAXl TCTCAGGCAG TGTGTCCTGG GACrTCTOAAT 300 

GGCCTGAGTG TGACCGGCGA TGCTGAGAAC CAATACCftGR CaCTGTACAA GCTCTACX3AG 360 

5 AGGTGTGAGG TGGTGATGGG GAACCTTGflG ATTGTGCTCA CGGGACACAA TGCCQACCTC 42 O 

TCCTTCCTGC AGTGOATTCG AGAAGTGACA GGCTATGTCC TCGTOQCCAT GAATGAATTC 480 

TCTACXCTAC CATieCXlCAA. CCTCGGCS&'rG GT<3CQAX36GA CCXAOGTCTA CGATGGQAAG 540 

TTTGCJCATCr TC5GTCATGTT GAACTATAAC ACCAACTCCA GCCACGCTCT GOSCCAGCTC SOO 

CGCTTGACTC AGCTCACOGA GATtCrGTCA GGGOGTGTTT ATATTGAGAA 6AACGATAAG 660 

10 CTTTGTCACA TGGACACAAT TGACTGGAGG GACATCSGTGA GGGftOCXSAGA TGCrcSAIBATA 720 

GTGSlGRAOa ACAATOGCSVe iWSCTGTOCC C3CCTGTCAT0 AGOTTTGCAA GCSGGOGATGC 700 

TGGGGTCCTG 6ATCAGAAGA CTGCCAOACA TTGACCAAGA OCATCTGTGC TCCTCAGTGT 840 

AATGOTCACT GCTTTaGQCC CAACCCCAAC CAGTGCTGCC ATOATGAQT6 TGCOGGGGGC 900 

TGCTCAGGCC CTCAjSGAGAC AGACTGCTTT GCCTGCCGGC ACTTCAATGA CAGTGGAGCC 960 

15 TGIGTACCTC QCTOTCCACA GCCTCTTGTC TACAACRAGC TAACTTTCX2A GCTGC3AACCC 1020 

AATCOOCACA CCAAGTATCA GTATGGAGGA &ITTGTGl?AG CCAGCI6TOC OCATAACTTT 1080 

6TGGTSGATC AAACATCCT<^ ^IGTCAGGGCC TGTCCTCCTG ACAAGA1GGA AGTAGAIAAA 1140 

AATGGGCTCA AGATGTGTGA GCdTOTaOQ GGACTAT6TC OCAAAGCCTG TGAGGGAACA 1200 

GOCTCTGOGA GOCGCTTCCA GACTGTGGAC TCQAQCAACA TTQATQGATT TGTGAACIGC 1260 

20 ACCAAGATCC TG66CAA0CT G6ACTTTCTG ATCACCGGCC TCAA.TCGAGA CCCCTGOCAC 132Q 

AAGATCCCIG CCCTGGACCC AGAGFkAGCSX: AATGTCTTCC GGACnOTAGG OGAl^TCACA 13 BO 

GGTTACCTGA ACATCCAGTC CTGGCCGCCC CACATGCACA ACTTCAGTGT TTTTTCCAAT 1440 

TTGACAACCA TTGGflGGCAG AAGOCICTAC AACOCSGGGCT TCTCATTGTT GATCATGAAG 1500 

AACTTQAATQ TCACATCTCT GOaCTTCOGA TCCCTGAAG6 AAATTAGTGC TOGQCaTATC 1S60 

25 TATATAAiSTG GCAATAGGCA GCTCTGCTAC CACXACTCTT TGAJUCTCGGAC CAAGGTGCTT 1620 

CC3GGaacCTA CX3GAAf3AGG5 ACIAlGACATC AAGCATAATC GGCX3QGaCAG AQhCIGCGTG 1680 

GCAGAGGGCA AAGTGTGTGA CCCACTGTGC TCCTCTGGGG GATGCTOQGG OCCAGGCCCT 1740 

GGTCAGTGCT TGTCCTGTCG AAATTATAGC 06AGGAGGT8 TCTGTGTGAC CCACTGCAAC 1600 

TTTCTOAATG GGOAGCCTCQ AOAATTTGCC CATGAGGCCG AATGCTTCTC CTQCCarJCCQ 1B60 

30 GAAIOOCAAC CCATGOGGGG CACTGOCACA TGCAATGGCT GGGGCTCXGA TACTTGTGCT 1920 

CAATiQTaCCC ATTTTGQAf3A TGOGCCCXAC ITGTGTGAGCA GCIGOCCOCA TGQAGTCCTA 19B0 

GGT6CCAAGG GCGCAATCTA CAAGTACGCA OATGTTCAGA ATGAAT6T06 GCOCTGOCAT 204Q 

OAGAACTGCA CCCRGGGGTG TAAAQGACCA GAGCTTCAAG ACrGTTTAGg ACAAACACTG 2100 

6TGCTGATCG GCAAAACCCA TCTGACAATG GCTTTQACRG TGATAGCAGG ATTGGTAGTG 2160 

J 5 ATTTTCATGA TGCTGOGCGQ CACTTTTCiC TACTGGCGTG GGOGCCGGAT TCAGAATAAA 2220 

AGGGCTKTGA GGCGATACTT GGAftOGGGGT GAGAOCATAG AfiCCrCTGGA COCCAGTGAG 2280 

AAGQCTAACA AAGTCTTGGC CAOAATCTTG AAAGKGACAG AGCTAAGGAA OCXTAAAGTG 2340 

CTTGGCTCGG GTGTCTTTGG AACTGTGCAC AAAQOAGTGT GGATOCCTGA GGGTGAATCA 2400 

ATCAAOATTC CfiOTCTaCAT TRAAGTCATT GAGGACAAGA CTGOAOOGCA QASTTTTCAA 2460 

40 GCTGTGACAG ATCATATGCT GGCCATTGGC AgCCrrGGACC A-3X3CCC2VCAT T6TAAGGCIG 2520 

CTBOGACTAT GOCCAOOOTC ATCTCIGCAG CTTGTCACTC AATATTTQCC TCEGGOnCCT 2580 

CTGCIGGATC AT6TGAGACA ACACCX3GGGG GCACTGGGGC CACAGCTGCT GCICAACTGG 2640 

GGAGTCACAAA TTGCCAAtQGa AATGTAOTAC CTTGAGGAAC ATGGTATGGT GCJ^TAaAaAC 2700 

. CTGGCTGCCC GAAAOGTGCT ACTCAAGTCA CCCAQTCRGG ITC3«3GTGGC AGATTTTGGT 2760 

45 QTOQCraAOC TGCTGCCTCC TGATGATAAG CAGCTGCTAT ACASTGAGGC CAAGACTCCA 2820 

ATTAA6TGGA XGGCCCIT<3A GAGTATCCAC TrCGGGAAAT ACAC&QUCCA GM3TGATGTC 2880 

TaOAaCIATa GTGTGACAGT TTGGGAGTTG ATGAjOCTTGG GGGCH3AGCC CTATGChOGG 2940 

CTAOGATTQG CTGAAGTACC AGACCTGCTA GAGAAGGGGG AGCGGTOXSGC ACRfiOCCCAG 3000 

ATCXGCACAA TTQATQTCTA CATGGTGATG GXCAAGTGTT GGATGATTGA TGAOAACATT 306O 

50 CGCXX^AACCT TTAAAGAACT AGOCAATGAG TTCAGCAGGA OrGGGCOGAGA CCOiGCAOGG 3120 

TAIGTGSrCA TAAAGAOAaA GAGTGGGOCI GGAAXAGOCC CTGOGCCaGA GCGXXATGGT 3180 

CTQACAAACA AGAAGCXAGA GGAAGTAGAG CTGGAGGCAG AACTAGACCT AGAOCTAGAC 3240 

TTGGAAGCAG AGGAGGACAA CCTGGCAACC AOCACACTOa GCTCCSCCCT CftGOCTACCR 3300 

GTTGGAACAC TTAATCGGCC ACnTGGQAGC CAGAGOCTTT TAA6TCCATC ATCTGGATAC 3360 
55 ATGOCCATGA ACCAGGGTAA TCITOGGGGG TCTTGCCAGG AGTCTGCAGT TTCTGGGAGC 3420 
AGTGAACGGT GCGCCX3BTCC AGTCTCTCTA CACCCAATGC CS^GGOGGATQ CCTOGOVrCA 34BQ 
GA5TCATCA6 AGGGGCATGT AACAGGCTCT GTV^GCTSAGC TOCAGGAGAA AffTGTCAAIG 3540 
TGTAGAAGCC GQAfaCABGAG CCOGAGCCCA GGGCCAOGCG GAGATAiGOGC CXACCATTCC 3600 
CAG0GC3CACA GTCTGCTGAC TCCTGTTACC CCSUrrCTCCC CACXXMGGTT AGAGGAAGAG 3660 
OU GAT6TCAA0G GTTA1X3TCAX GCCSGATACA CACCTCAAAS GTACTOCCTC CTCCCGGGAA 3*720 
GGCACCICTTT CTTCAGTGGG TCTCAGTTCT GTCCXGGGTA GT6AAGAAGA AGATGAAGAT 3780 
GA6GAGTATG AAXACATGAA COGGAGGftGA AQGCACAGTC CACClCAm: CDCTAGGCCA 3840 
AGTTOCCTTO AGQAGCTQOQ TTATOAGTAC ATGGATGTGG GGTCAGACCT CSVGTOCCTCT 3900 
CTGGGCAGCA CACAGAGTTG CCCACTCCAC OCTGTACCCA TCAIGCCCAC TGCRGGCACA 3960 
05 ACrCCAGATG AAGACTATGA ATASTATGAAT CGGCAAOGAG ATGGAGQTGG TCCXGGGGGT 4020 
GATTA1GC3K3 CCATGGGGGC CIGCCCRGOL TCTOAGCAAG G6XA3GAASA GATGASAGCT 4080 
TTTCAGGGGC CTGGACATCA OSCCCCCCAT GTCCATTATG CCOGCCL'AAA AACTCaAOGT 4140 
AGCXTAGAGG CTACAGACTC TGCCTTTGAT AACCCIGATT ACIGGCATAG CAGGCTTTTC 4200 
CCCAAQGCTA ATGCCC&QAG AACGTAACTC CIGCTCCCTG TGGCACTCAQ QOAGCATTXA 4260 
70 ATGGCAGCTA GTGCCtTTAG AGGGTAOCGT CXTCTCSCCTA TTC3CCTCTCT CI0CCA6GTC 4320 
GCAGCOCCTT TTCOCCAGTC QC»anCAATT CCATTCAATC TTTGGAGGer rSTTAAACATT 4380 
TTGACACAAA ATTCTXATGG TATGZAGCCA GCTGXGGACT XTCTXCTCIT TCCCAACC3CX: 4440 
AOQAAAGGTT TTCCTTATTT TGTOTGCTTT CCCAGTOCCA TTCCTCAGCT TCTTCACAGO 4500 
_ CACTCCTGGA GATATGAAGG ATTACTCTCC ATATCCCTTC CTCTCAGGCT CTTGACTACT 4560 
75 TGGAACTAGG CTCTTATGTG TGCCTTlGrT TCCCATCAGA CTGTCAAGAA OAGGAAAGGG 4620 
AGGAAACCTA GCAGAGGAAA GTGXAKTTTT GGTTTAXGAC TCIIAACCOC CTAGAAAQAC 4680 
AGAAGCTTAA AATCTOTQAA GAAAGAGGTT AGGAGTAGAT ATTQATTACT ATCATAATTC 4740 
AGCACTTAAC TATGAGCCAG GCATCATACT AAACTTCAOC TACATTATCT CACTTAGTCC 4800 
TTTATCATCC TIAAAACAAT TCTGTGftCAT ACATATTATC TCATTTTACA CAAAGQGAAG 4860 
oO TOGGGCATGQ TGGGTCATGC CTGTAATCTC AGCACrTTGG QAQQCTGAGG CAGAAGGATT 4920 
ACCTGAGGCA AGGAGTTTGA GACCAGCITA GCCAAOITAG TAAGACCCOC ATCIC 4975 



Seq ID NO: C76 DNA Sequence



Nucleic Acid Accession #: NM_001216.1
Coding sequence: 43..1423






1 
I 

gcccgtacac 
agcccctggc 

ct<;tcact<3C 
tccckcttgg 

gagqatctac 

aataatgccc 
cogccciogc 
c&ccoccagc 

CTCOOGCXXSC 
OdCCTGGGC 
CTGCftCTGGG 
CCrGCC6AQ& 
GGGGGCCCGG 
AGIGCCTATQ 
CAGGTCCCAG 
TATGAjSGOOT 
CAGACAGTGjV 
GGTGACTCXC 



AATTCCTGOC 
ACCAGCGTCa 
QTQAGCTACC 
TGTGAGAAGC 
ATGOCACTTC 



11 
I 

ACCGTGTOCT 
TCCCTCTGTT 
TGCTTCTGAT 
GAGGAQGCTC 
AZTCACCCAG 
CTGGAGAGGA 
TAGAGQATCT 
-ACAGGGACAA 
CCCGGGTOTC 
TCGCaSOCTT 
TCCCAGAACT 
TAGA6ATGGC 
GGGCTGCAGG 
TCCACGTGGT 
GAOacCTOGC 
AGCAGTITGCT 
GACTGGACAT 
LT LTIU AC^AC 
TGCTGAGTGC 
GQCTACAGCT 
TCCCT6CXG6 
TGGCIGCTGG 
CXyiTCQTTQT 
GCCCAGCAC5A 
CAGCXAGAGO 
CTTTTAACTG 



21 
I 

GQC3ACACCCC 
GATCCCGGCC 
GCCTGTCCAT 

TTCTGGGGAA 
AGAOGAGQAT 
G6ATCTAOCT 
ACCTACIQTT 
AGAAGGG6AT 
CCCAQCCTQC 



TCTGGGTCCC 

TCGTOCGGOC 
TCACCTCAGC 



31 
I 

ACAGTCAGCC 
CCTGCTCCA6 
CCCCAGABGT 
GATGAiXCAC 
CCACCGGCSAG 
GAAGTTftAGC 
GAGGCTOCTG 
GACCAGAGTC 
OCGGGCCQCT 
CTGCGCCOCC 
AACAATGBGC 
G0G006GAGT 
TCSGAGCACA 
ACCGCCTTTG 



GTCTOGCTTG 
ATCIGCACrC 
ACCGCCCTGT 
TAAGCAGCTC 
GAACITCCX3A 
AGTGGACAGC 
TGACATGC!TA 
<3CW3ATGAGA 
OGTA6CCGA6 
OiTCTQAGGG 
CCAASAAATT 



GAAGAAATOG 
CTGCCCTCTG 
GCCCAGGGTQ 
CACftOCCTCT 
GGGAOaCAGC 
AGICCTOGGG 
GCCCTGGTTT 
AGGCAGCACA 
ACTGGAGCCT 
QGAGGOGQTA 
TTTTAAAATA 



41

1 

GCATG6CTC3C 
GCCTCaCTGT 
TGCCCCGGAT 
TGGG06^GaA 
AGGAGOAICr 
CTAAATCAOA 
OAGATCCTCA 
ATTGGOGCTA 
TCCAQTCCCC 
TG6AACTGCI 
ACAfSTGTSCA 
AOOGGGCTCT 
CTGTGGAAGG 
OCAGAOTTGA 
AGQAjQGQCCC 
CTGAGGAAGG 
ACTTCAGOCG 
TCATCTGGAC 
CTGACAOOCT 
CITTOAATGO 
CI0CTGW3CC 

ttbqcxztcct 

GAAGGGGAAC 
AGAGGCTGGA 
ACrGTCCTOT 
AATATTTATA 



51 
I 

GCAACTQCIG 
6CAGGAGGAT 
QOATCTGCCC 
ACCIOGAGAfi 
AGAAGAG6GC 
AGAACCCXZnS 
TOGAGGOGAC 
GGrGOATATC 
GGGCTTCCAG 
ACTGACCCTG 
GQVSCTGCAT 
CCACOSTTTC 
CGAGGCCTTQ 
GGAAOAAAAC 

CTACrrOCAA 
TGTGTrTAAC 
GTG66GACCT 



AGTGCAGCTG 

ttttgctgtc 
caaagggggt 

TCTIGGAGAA 
CCTQCTCATT 
AT 



Seq ID NO: C77 DNA Sequence

Nucleic Acid Accession #: NM_004207.1

Coding sequence: 63..1460



1 

I 

GGGQAE3AGGC 
CCATGOGAGG 
GQaaCTGGGC 
AGGCGGlCAlG 
CAGCCTGQAT 
TGTGCGTGAA 
TGQGCaTGGr 
TCATCACSGGG 
ACTTCAGCAA 
TGTGTGCCCr 
TCCTCATCCT 
TGGTGGTCAC 
TGAfiCfiTCIT 
TGGGGCTCTT 
ACACCAAGGC 
COGt33GGCrT 
TCTGCATGTT 
GCCTCBTGGT 
TCGAOGTSCT 
TQCIGATGOA 
CGACCOVCGT 
TGATTTTGCT 



11 
1 

GGGCTGAGGC 
GGCCQTGGTG 

TGTCTTCTTC 
CTCCTOCATC 
CCGCTTTOQC 
GGCTGOGTCC 

GOOGCGCCCC 
GAaCCCGCTG 
GGGCGGCCTG 
OGCCCAGCCG 
COGGGACOGC 
GSrCCGGCCC 
CBCCTTCCTG 
06TGGQ0GGG 
CTTCIU^GGGC 
CTTCTGCATC 
CATGGGCAIC 
GGOGGTGGCC 
CTACfUTGTAC 
GCTGGGCAAC 
CGOGGAOt^ 



21 
I 

GGCCCAGOOG 
GAC30AGGGCC 
06CTGXITCXS 
AAGGAGCrCA 
CTGCTGGCCA 
TGCCX3C3CCCO 
TXTTOCCGGA 
GCACrCAACT 
ATGGCCaAGS 
GGGCAGCTGC 
CTGCrCAACT 
GGCTOGGGGC 

GTQTTGGTGG 
CTCRCCftTOC 
CTTGGGAAGG 
CTGGCGQACC 
TTCTTTGGCA 



6TGCTOGTOG 
GTQTTCATCX: 
TTCTTCTGCA 



TGOGGGAGGT 
CCXrCGOAAAC 
AAGOOGGCAA 
GOQCTCCAGC 
GGGGGTGGGA 
MGGCATGCr 
CTCAASGAOC 
CGGCU3ACAG 
GGCTCftCA*rG 
CraGAAGATO 
TT 



AA6TGTCTGA 
OGCTTGCTAT 

CACCAaOKSC 
TOGAAACGCA 
GCTGGCAQGG 
GGGOCTGTGC 
GAAAXAAACC 



GTOSCTGGGC 
TTATTTTACa 
GOGGATOQTC 



GCGGGCTGCT 
TGCrrCGAGA 
CAQOTGCTGC 
CCACCCCTCT 



31 
I 

OGGCAGQTGA 
□CACAGGCGT 
TCATCACTGG 
TACACSGAOTT 
TGCTCTAC3GG 
TCATGCTTOT 
GCATC3^TCCA 
TCCAGCCCTC 
GGCXGGGGGC 
7GCAGGACOG 
GCTGGGTGTG 
GGGCGCGACC 
TTTACGCGGT 
TGAGCIAGOC 
TGGGCTTCAT 
TGCGGCCCTA 
TOGOGGGCTC 
TCTCCIAQ36 
ACAAGTTCTC 
GGCCCOCnC 
TCGCGGaOGC 
TTAGGAAGAA 
AC3UUSCCTCC 
AGOCTGAGAA 
GGGGGCGGCA 
AACTGGACTG 
6GGC6ATGAG 
ATCTGCIGGTQ 
GCTGOCAGGT 
CAAOGTGACT 
GTGGGGOOCT 
TGAGTGTCTT 
T6GAGTGTTC 



41 
1 

GGGGGAAOCA 

CTTCrCCTAC 
TGGGATCX3GC 
GACAGGTCOG 
GGOGGGTCTC 
GGTCXACCTC 
GCTCATCATG 
AGCAGGTAGC 
CTACGGCTGG 
TGOCGCACTC 
CrC3CCGGCGC 



TGACATCTTC 

TAOGGOGGQC 
CA3X36TGGGG 
CA6TGCCATT 

OGGA6GCAAA 
□GAQGTGCTC 
GOOCAAAGAG 



51 

I 

ACCCTOCTGG 
GACBOCGGCT 
GCCITCCGCft 
TACAOCXaACA 
CTCTGCAJ3T6 
TTTGCGTOSC 
ACCACTGGGG 
CTGAAOOGCT 
CCTGTCITCC 
GGGOGCX9GCT 
ATGAGGCOCC 



6TCATGGTGC 
GGGGTGOCCG 
60G0G6GC06 
CTCTTCAGCr 
GA^AOGGCG 
GCOCIGCAGT 
GGCCTGGTGC 
CTCCTGGfiXG 
ACCTCCTCCX: 
CCACAGOCTG 



AAftOGGGGAG GT66TTCACA 
GGCACA0GGA GGAGGTACAG 
GCTCKGGCAQ GQCCACX^T 
TGTTTTGAGG GGGAAGGT6G 
QCAAlGaTTAC 
CACTGCTATG 
TTAATGGGAa GGTGGGTGGG 
CTGCAGOOCG TCCTACOCTG 
GGGGACAGCT CTTTCCAOCC 
TCeXGCGQAA TTCAAAAAGG 



60 

120 ' 

180 

240 

300 

360 

420 

4G0 

540 

600 

660 

720 

760 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1552 



60 

120 

IftO 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

12 DO 

1260 

1320 

1B80 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

1982 



Seq ID NO: C78 DNA Sequence

Nucleic Acid Accession #: NM_000358.1
Coding sequence: 48..2099

1 11 21 31 41 51

| | | | | |

OCTTGCSCCGT CGGTCGCT9WS CTCGCTCOGT GCG06TCGTC CCSCTOCAnO GCGCTCTTCG 60 

TGGQGCiTGCr GGCTCT06CC CTGGCTCTGG aOCTGOQCOC CGOGGOGACG CIOGCGGSrC 130 

OCBCCAAGTC GCCCTACCAG CIGGTGCTOC AGCACAGCA6 aCTOQGGGGC OGOCAGCACG 180 

GC3CCCAAC3GT 6TGTGCEGTQ CAOAAGGTTA TTGGCACTAA TAGGAAGTAC TTCACChACI 240 




GCAAXBCftGTG GTACCRAAOG AAAATCTGTO GCRAATCAAC AGTCATCAGC TACGAiSTGCT 3 00 

<3TGCT<SGATA TGAAAAGQTC CCTGGGGAGA AGGGCTC3TCC AGCAGCCCTA CCACTCTCAA 360 

ACCTTTAOGA GACCCTGGSA GTCGTTGGAT CCACCACCAC TCAGCTGTAC ACGGACOSCA 420 

OGGAGRftOCr GAGGGClGAa ATOCXAGGGGC CCGGCftGCTT CACCATCTTC GCCCCTAlGCA 4 BO 

A06AI3GCCT6 QGCCTCCTTG CCftGCTGAAG TGCTGtSACTC CCTGOTCRGC AATCTCAACA 54 0 

TTGAGCTGCT CAATGOCCTC CGCTACCATA TGGTGGGCAG GCGavsraCTO ACTGATtSAGC 600 

TGAAACACGG CATGAOJCTC ACCTCTATGT ACXIAGAATTC CAACATCCAG ATCCAOCACT GGO 

ATCCTAATGG 6ATT6TAACT GTGAACT6TG CCCGGCTCCT QAAAGOCGAC CACCATGCftA 720 

CCAA00GG6T GGTGCACXnX! ATCX3ATAA0Q TC3lTCTCX3kC CftTCACCAAC AACATCQiec 780 

AGATCATTGA GATCGAGQAC ACCTTTGAGA CCCTTOGGGC TGCTCTGGCT GCATCAGGGC B40 

TCAACACGRT GCTTGAAGQT AACGGCCSyST ACACGCTTTT GGCCCCOAOC AArGAGGCCT 900 

TCGAGAAGAT CCCTAnTGAG ACttTGAACC GTATCCTGGQ CGACCCAGAA GCCXn^SAGAG 560 

ACXTGCTGflA CAACCACATC TTGAAGTCAG CTATGTOTGC TGAAGCCATC GTTGCGOGGC 102O 

- _ TGarCTBTAGA GflCC3CTG(»G GGCACOACAC TOOftOSTGCSQ CTGCAGCGGG GACATGCTCA lOBO 

1 J CTATCAACGG GAAGGOGATC ATCTCCAATA AAOACATGCT AfiCCACCAAC GGGGTGATCC 1140 

ACTACATTGA TGAGCTACTC ATCCXZRQACT CAGCCAAGAC ACTATTTGAA TTOGCTGCAG 1200 

AQTCTGATGT GTCCACAGCC ATTGACCTTT TCAOACAAGC CGGCCTCQQC AATCATCXCT 1260 

CTGGAAGTGA GCGGTTOACC CTOCTGQCTC CCCTGAATTC TQTATTCAAA GATOOAACCC L320 

CICCMTTGA TGCCCaiftCA AGQftATTTGC TTCOGAACCA CATAATTAAA GACCaSCTGO 1380 

CCTCTAAGTA TCTGTACCAT GGACaaACCC TGGAAACTCT GGGOGGCAAA AAACTGAGAG 1440 

TTTTTGTTTA TOGTAATflOC CTCTGCATTG AQAACAGCTG CATCGCGGCC CACGACAAOA 1500 

GGGGGAGQTA CQGGACCCTG TTCACGATGQ ACCGGGIGCT GACCCCOCCA ATGGGGACTG 15S0 

TCaTGGATGT CCTGAAGGGA QAGAATCGCT TTAfiCaTaCT GGTAGCTGCC ATCCAGTCTO 1S20 

CBGGACTOAC GGAGACCCTC AACJOSSGAftO QAGTCIACAC AGTCTTTGCT CCCACAAATG 1680 

-6 J AAGCXn-TCGG AGCCCTGOCA CCAAfiAGAAC GGASCAGACT CTEGOGAGAT QCCAAGGAAC 1740 

TTGCCRAC31T CCTGAAATAC CACATTGOTG ATQAAftTCCT GGTTAGGGGA GGCATCOOGG 1800 

OCCTGQTGOG GCTAAAGTCT CrCCAAGGTG ACAAGCTGOA AGTCAGCTIG AAAAACAAIG I860 

TQGTGAGTGT CAACAAOGAG CCTGTTGCCG AGCCTOACAT CATGGCCACA AAT66C6TOG 1920 

Qrt TCOJTGTCAT CACCAATGTT CTGCRSCJCrC C3VG0CMW31G AGCTCAGGAA AfiflGGGGATG 1900 

AACTTGGAGA CTCTGOGCTT QAQATCTTCA AACAAGCATC A)G0GTTTTC3C AGGGCTTCCC 2040 

AGftGGtCTOT GOGACTAQCX: CXTEGTCTATC AAAAGTTATT AGROAGGATG AAGGATTAGC 2100 

TTGAAGCACT ACK3GAGGAA TGCACCACGG CAGClCTCOS OCAATTrCTC TCAGATTTCC 2160 

ACAGAGACTG TTTGAA TGTT TTCAAAACCa AGTATCACAC TTTAATGTAC ATGGGCCGCA 2220 

-ac CCaWAATGAG ATGTGftGGCr TQTQCATGTG GGGGAOOAGG GAGAGAGATQ TACTTTTTAA 2280 

^^^T????^ CCCTAAACAT GGCTGTTAAC OCACIGCATG CAGAAACTT6 GATQTC31CTG 2340 


CCTOACATTC ACTTCCAGAG AGOACCTATC CCRAATGTQQ AATTGACTGC CTATGCCAAO 2400 

TCCCTGOAAA AOGAGCTITCA GTATTGTGGG GCTCATAAAA CATQAATCAA GCAATCCAGC 2460 

CrCATGGGAA 6TCCTGGCAC AGTTTTTGTA AAGCCCTTQC ACAGCIGGAG AAATGGCATC 2520 

MTATAMCr AlGaGr rOAA ATGTTCTGTd AAATGTGTCT CRCATCiaCA CGTGGCTTGG 2580 

AQSCTTTTAT GGOGCCCIGT CCAQOTAGAA AA6AAATGGT ATCTAGMSCT TAGATTICCC 2G40 

TATTCTGACA GAGCCATOGT GTGTTTGIKA TAATAAAAOC AAAGAAACAT A 2691 



Seq ID NO: C79 DNA Sequence

Nucleic Acid Accession #: NM_006536.2
Coding sequence: 109..2940

1 11 21 31 41 51

ACCTAAAAOC TTGCAA6TTC AOOAAGAAAC CATCTGCATC CATATTSAAA ACCTGAGACA 60 

A3?OTATGCAG CAGGCTCAGT GTGAGTGAAC TOGAGGCTTC TCTAC3UVCAT GACCCAAAGG 120 

AGCATTQCAO GTCCTAnTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TCGAGTACAfi CTTCJUMaACA A1X3GGTATAA TOGATTOCTC 240 

MTGCAATTA ATCCl CAQOT ACClGRaAAT CWJAACCICA TCTCRAACAT TAAjGGRAATG 300 

ATAflCT GAaG CTStXTTT'SA tMATTTAAT GCTACCAAOA ^UU»GXATT TTTCS^GAAAT 360 

ATAAAGATTT TAATAOCTQC CACAIGGAAA GCTAATAATA ACAOCSUVAAT ATUUVCAAGftA 430 

XCATATCAAA AfSGCAAATGT CATAiGnGACT GACTGGTATG GGGCACATB3 AGATGATOCA 480 

TACACCCTAC AATACRBRBG GTGTGGftAAA GAGGQAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTRCTGA AnSATAACTT AACROCTOGC TACSSGATGAC QAGGCOSfiOP GrTTGTCCAT 600 

OU GAATGGGOCC ACCTOOGTrO GGGTGTGTTC QATOAGTATA ACAATGAGAA aOCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTQTGTGAAA A/KSGTGCIXG CCCCCAAGAA AACTOTATTA TTAGTAAQCT TTTTAAAfiAA 780 

Qg^TG CACCT TTAgCEACAA TAGCAOOCSUV AATGCAACTG CATCAATAAT GTTCATGCaA 840 

- - AGTTTATCTT CT aTgOT TGA AZTTTGTAAT GCAAfiTACXXr ACAACCAAGA AGCACCAAAC 900 

CTACRGAAOC A GA1G reC3«? CCTCAlGAAGT GCnTGGGATG TAATCACAGA CXCXGCIGAC 360 

TTTCaCcaCA GCTTTCOCAT GAATGGGACT GRGCTTCC3«: CTOCTCCCAC ATTCTCGCTT 1020 

GTACAGGCrG GTGACAAAQT GGTCTGTTTA GTgCTGGATG TGTCCAG<3VA GATGGCaOAG 1080 

CCTGACAGRC TCCXTCAACT ACAACAAGCC GCAGAATTTT ATTTCATGCA GKTTGTVStA 1140 

ATTCATACCr TGSTGGGt2AT TGCCA6TITC OAQUGCAAAG GAGAGATCAG AG0CC3kaCIA 1200 

CAOCAAATTA ACA6CAATGA TGRTGGAAAG TCQCTOGTTT CAIATCTGCC CACCACXGIA 1260 

TCRGCTAAAA CAGACSOTCAG CATrTTGTTCA GGGCTTAASA AAGGATTTGA GGTGOTXGAA 1320 

AAACTGAATG GAAAAGCXTR TGGCTCTGTG ATQATATTAG TGACCAQCGO AGATGATAAQ 1380 

CTTCTTGGCA ATTGCTTSVCC C3VCTGXGCrC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440 

^ _ CTGGGTTCAT CTGCAGOOCC AAATCTGSU? GAATTATCS^C GTCTTACAGG AGGTTTAAAB ISOO 

/J TTCITTGTTC CAGATATATC AAACKXaAT A6CATGATTS ATGCTTTCAG TAfiAATTTO ISfiO 

TCTGGAACTQ GAOACATTIT CCAGCAACAT ATTCAflCTTG AAAGTACROG TGAAAATGTC 1620 

AAAOCTCACC ATCAAITGAA AAACAC3U5TG ACTGTGGATA ATACTGTGG6 CSlAOGACACT 1680 

A1GTTTCTAG TTAQ3TGGCA GGCCAGTGGT GCTCCrOAOA TTATATTflaT TGATOCTGAT 1740 

GQAOCTUUUVT ACTACRCftAA TAATTTTATC ACCAATCTAA CTTTTCGGAC AGCTWSTCTT 1800 

OU TGGATTCCAG GAACAGCIAA GCCTGBGCaC TQGACTTACA CC3CTGAACAA TftGCXMTCAT 1860 

TCTCTGC&AG CCCTQAAAGT GACAGTGAOC TCTDOCGCCT (^CAACTCftGC TGTGCCCCCA 1920 

GCCACTQTGG AAGCCITXOT GGAA7U3AGAC AGCCTCCATT TTCCTCATCX: TQTGATCATT 1980 

TATGCCAATG T6AAACAGGG ATTTTATCXX: ATTCTTAATG CCACIGXCAC TGCX3\C3W3TT 2040 

GAGOCAGA6A CIGGAGATCC TGTTACGCTO AGACTCCTTO ATGATGGAGC AGaTQCTGAT 2100 



GTTATAAAAA 
TATAGCTTGA 
CCAGGGAGTC 
GCTCCUU^GJk 
AGCTCAGGAG 
CCACCATGCA 
TGOACAGCAC 

AAGGGAAATC 
AOQAATGGAC 
GGAATAGGA6 
CCTCTGrrTA 
QQAGTTTTAA 
CATACTTTAA 
ATAAATATCC 
CATACTAACA 
ATACA6ATAA 
CCTTACACTT 

AATAGCCCCA 
TCATTTAGTT 
TTTACATGAA 
CTTCCrATTX 
TTTCRCTGTA 
'XTTATSACAA 
TTTCXAAOTT 
TACCXAOGAA 



ATGATGQAAT 
AAGTQCArDGT 
ATGCTATQTA 
AATCAOTAGG 
GCTCCTTTTC 
AAATTATTGA 
CTGGAGAAOA 
TACAGAATAT 
CTCAGCAAGC 
CTGAACATCA 
CAATGGATAG 
TTCCCCCCAA 
CAGCAATQGG 
GCAQGAAAAA 
AAAGTGTCTT 
AAGTCAAATT 
GATTTTTACA 
TGGCTATGAA 
GGSTAAAGIC 
AGCftiaftGiAAA 
ACTTTGATTA 
GATCATQCTA 
TGTTATATAT 
AGAGGXAACC 
AGGTCTATTG 
TATTGCCTTG 
A 



TTACTCQAOG 
CAATCACTCT 
TGTACCAGQT 
CAGAAATGAG 
AGTGCTGGQA 
CCTGGAAGCr 
CTTTGATCAG 
CCAAGATQAC 
TGGCATCAQG 
GCCAAATGGA 
GAACTCCTTA 
TTCTGATCCT 
TXrOATAGGA 
GAQAGCABAC 
CCIICTTAGA 
AACATCAAAA 
TGGTAQATCA 
CAAATAAXAA 
GGACCAOTGT 
AOGAGOGXAO 
ATTTTTCTTT 
TATTTTATAT 
ATTTCAOATG 
TTTAACAAXA 
AATTTATTTG 
GGTI3\TTATG 



TATTTTTTCT 
CCCAGCATAA 
TACACAGCAA 
GASGAGOSAA 
GTTCCAGCTG 
GTAAAAGTAG 
GGCCAGGCTA 
TTTAACAATG 
GAGATATTTA 
GAAACACATG 
CAGTCTGCTG 
GTACXTTGCCA 
ATCATTTGCC 
AAGAAAGAGA 
TATAAGACCC 
CTGTATTAAA 
ACAATTCTTT 
AAATTATTCT 
CftAGGAAAGT 
GTCTGCATTA 
TCTCCTTATC 
ATGTAfiCCCC 
ACATCTCCCT 
TGGGTATTAC 
TNTGrAAETTT 
GAAT6ATAGT 



CCTTTGCTGC 
GCACCCCAGC 
ACGGTAATAT 
AGTQGGGCrrT 
QCCCCCACCC 
AAGAGGAATT 
CAAGCTATGA 
CTATTTTAGT 
CGTTCTCACC 
AAAGCCACAG 
TATCTAACAT 
GAGATTATCT 
TTATTATAGT 
ATGGAACAAA 
ATGGCCTTCG 
ATGCATTC5AG 
TTGGGGGTAG 
TTAAAGTAAT 
TTQTTTTATT 
TAACTGTCTG 
TGTGCAGTAC 
TAATGCAAAG 
GCTAATGCTC 
CTTTGTCTCT 
TCTACTCCCA 
TATAGCCCOF 



AAATGGTAOA 
CGACTCTATT 
TCAGAT6AAT 
TAGCCGAGTC 
TGATGTGTTT 
□ACCCTATCr 
AATAAGAATQ 
AAATACATCA 
CCAGATTTCC 
AATTTATGTT 
TGCCCAGGOG 
TATATTGAAA 
TGTGACACAT 
ATTATTATAA 
ACTACAAAAA 
TTTTTGTACA 
ATTAGAAMC 
GtCTTTAAAG 
OAGGTGGAAA 
^TGAAGCAA 
AGGTTGCTTG 
CTCTTTAOCT 
AGAGATCTTT 
TCATACOGGT 
TCAAAGCAGC 
TATAATGCCT 



Seq ID NO: C80 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1413



ATGAAOTTTC 
AaCTCTACAA 
TAIGGCXTTG 
AAGGAAAAAA 
ACWTClAqCC 
AGOGAAATSC 
TACACACCTS 
TGGAGTAATG 
STGGTTTTTG 
CTAQCCCAXO 
GAAITCTGGA 
GOC3CATTCCT 
AAATATGTTG 
CTQTAXGGAG 
CTCrGTOACC 
TTCAAAGACA 
ATTTCTTCCr 
AGAAATCAAG 
GRGCCAAATr 
GATGCAGCTG 
TOQAOGTATQ 
AACTTCCAAG 
TATTTCTTCC 
ACACTQAAAA 



11

I 

OTCTAATRCT 
GCCTQGAAAA 
AGATA?VACAA 
TCCAAGAAAT 
TQGAGAXGIAT 
CAGG6G060C 
ACATGAACCG 
TTACCCCCXT 
CCCGTGGAGC 
CrrTTGGACC 
CTACACATTC 
TAQQTCTtEOa 
ACATCAACAC 
ACCCAAAAGA 
CCAATZTQAG 
GGTTCTTCTG 
TATC3GCCAAC 
TTElTCTTTT 
J^'CCCA flGaO 

tttttaaccc 

ATGAAAGGftS 
GAATCQGGCC 
A/iGGATCTAA 
GCAATAGCTG 



21 
1 

GCTCCTGCAG 
AAATAATGTG 
ACITOCAGTG 
QCAGCACTTC 
GCSVCGCRCCr 
CGTATGGAOG 
TGAGSATGTT 
OAAATTCAGC 
TCATQGAGAC 
IGGATCTOaC 
ABGAOGCACA 
CCATTCTAGT 
ATTTOGOCTC 
GAAOCAAOGC 
TITTGATGCT 
GCTGAAGGTT 
CTTGCCATCT 

CATACATTCT 
ACQTTTTTAT 
ACAGA'TGATG 
TAAAATT6AT 
CCAATTTQAA 
GTTrGGTTGT 



31 

I 

QOCACTGCTT 
CXATTTC3GTG 
ACAAAAATGA 
TTGGGTCTGA 
COATGTGGAG 
AAACATTATA 
OACTAOGOiA 

aagattaac:a 

TTOCATGCTT 
ATXGGAGGG6 
AACTTGTTCC 
GATCCAAAGG 
TCTGCrGATQ 
IIGCCAAATC 
GTCACIACSCa 
TCTGABAGAC 
GGCATTGAAjG 
AAATACTOGT 
TTTGGITTTC 
AG^CCXACT 
GACGCrOGTT 
QCAGTCTTCr 
TATGACTTOC 
IGA 



41 
I 

CTGGAGCrrCT 
AAA3ATACTT 
AATATAGTGG 
A?VGTGACCGG 
TaCCOGATGT 
TCAOCTACAG 
TOOCGAAAGC 
CftCaCATGGC 
TTGATGGCAA 
AlGCACATTT 
TCACTOCTGT 
COQTAATGIT 
ACATACQTGG 
CTGACAATTG 
TGGGAAATAA 
CAAAGACCAG 
CXGCTTATGA 
TAATTAGCAA 
CTAACTTTGT 
TCTlTGrAGA 
ATOCCAAACT 
ACTCXAAAAA 
TACTGCAAOQ 



51 
I 

TCCCCTGAAC 
AOAAAAATTT 
AAACTTAATG 
GCAACTGGAC 
CCATCATTTC 
AATCAATAAT 
TTTCCAAGTA 
TOACATTTTO 
AGGIGGAATC 
COATGAGGTIC! 
TCACGAGATI 
CXCCACf!TAC 
CSVXTCAGTCC 
ASAACCAGCT 
GAICTTTTTC 
TOTTAATITA 
AATTGAAGOC 
TTTAAGACXA 
GAAAAAAATT 
TAACCAGTAl 
□ATCACCAAG 
CAAAXACTAG 
TATCAGCAAA 



Seq ID NO: C81 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1413



1 
1 

ATGAAGTriJC 
AOCTCCACAA 
TATOSOCTTO 
AAGGAAAAAA 
ACATCIACCC 
AGGGAAATGC 
TACACACCro 
TGGAGTAATG 
OTGOTTTlTQ 
CTAGCCJCATG 
GAATTCTGOA 
GQC3CATTCCT 
AAATATGTTG 
CTGTATGQAG 
CTcrGTOAOC 
TTCAAAGACA 
ATTTCTTCCr 
AGAAATCAAG 
GftGCCUUVTT 



11 
1 

TTCEAATACT 
6CCXX3GAAAA 
AGA'EAAACAA 
XCCAAGAAAT 
TGGAQATGAT 
CAGOGGGGCC 
ACATGAACCG 
TTACCCCCTT 

cccgtggagc 
cttttggaoc 
ctacacattc 

TAQOTCTTQa 

acatcaacac 
acccaaaaga 
ccaatttgag 
ggttcttctg 

TATOGCCAAC 
TTTTTCTTTT 
ATCCCAAGAG 



21 
I 

GCTCCTGCAO 
AAATAATGTG 
ACXTOCAGTQ 



GCAOGCRiCCr 
CGTATGGAGG 
TGRGGATOTT 
GAAATTCAGC 
XCATGGAGAC 
T0GATCTG6C 
AGGAGGCACA 
CCATTCTAGT 
ATTTCGCCTC 
GAACCAAOGC 
TTTTGATGCT 
GCTGAAGGTT 
CTXQOCATCT 
TAAAGATGAC 
CATACa^TTCT 



31 
I 

GOCACTGCXT 
CTATTTGCTG 
ACAAAAATGA 
TTGGGTCTGA 
GGATGTGGAG 
AAACATTATA 
GACTACGCAA 
AAGATTAACA 
TTOCATGCTT 
ATTGGAGGGG 
AACTTGTTCC 
C3ATCCAAAGG 
TCTGCIGATO 
TTGCCAAATC 
GTCACTACOG 
TCTGAGAGAC 
GGCATTGAA6 
AAATACTOGT 
n!TGGTTTTC 



41 

I 

CTGOAGCTCT 
AAAGATACTT 
AATATAGTGG 
AftGTGAGCGG 
TCCCCGATGT 
TCACCTACAG 
TCOaOAAAGC 
CAGGCATSGC 
TTGATGGCAA 
ATGCACATTT 
TCACTOCTGT 
CCGTAATGTT 
ACATACOTGG 
CTQAGAATTC 
TGGGAAATAA 
CAAAGACCAG 
CTQCTTATQA 
TAATTAGCAA 
CTAACTTTGT 



51 
I 

TCCCCTGAAC 
AGAAAAATTT 
AAACTTAATG 
GCAACTGGAC 
CCATCATTTC 
AATCAAT7AT 
TTTCCAAGTA 
TGACATTTTG 
AGQTGGAATC 
CGATGAOGAC 
TCACGCCATT 
CCCCACCTAC 
CATTCAGTCC 
AGAACCRGCT 
6ATCTTXTTC 
TGTTAATTTA 
AATTGAAGCC 
TTTAAGACCA 
GAAAAAAATT 



2160 
2330 
2280 
2340 
24O0 
2460 
2520 
2580 
2640 
2700 
2760 
282D 
2880 
2940 
3000 
3060 
3120 
31SD 
3240 
3300 
3260 
3420 
3400 
3540 
3600 
3660 
3671 



60 

120 

XQO 

240 

300 

360 

430 

480 

540 

60O 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

X200 

1260 

3.320 

13BO 

X413 



60 

120 

180 

240 

300 

360 

420 

480 

£4Q 

600 

660 

720 

780 

840 

900 

960 

1020 

108O 

1140 



GATBCAGCTG TTTTTAACCC ACGTrTTTAT AGGS^OCTACT TCTTTGTAGA TAACCA6TAT 1200 

TGGAGGTATG ATGAAAGGAG ACSiOATGATG HACCCTGaTT ATCCCAAACT GATTACCAAG 1260 

AACTTCCAAG QAATCSGGCC TAAAATTGAT GCAGTCTTCT ACTCTAAAAA C3»ATACTAC 1320 

TATTTCTTCC AAGGATCTAA CCaATTTQAA TATt3ACTTCX: VMrrCCRTkCG TATCACCAAA 1380 

ACACTOAAAA GCAATAfiCTG 6TTTGGTTGT TGA 1413 

Seq ID NO: C82 DNA Sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11..793



1 11 21 31 41 51

| | | | | |

AATOCOGACA ATGGCGAAAS ACAACXCflAC TGTTCX3TTGC TTCCAGfOGCC TGCTGATTTT 60 

I _ TOGAAATQTB ATTATTGGTT GTTGC3SGCAT TCJOCCTGACT QOGGAOTQCA TCTTCTTTQT 13 0 

ID ATCTGAOChA CaCAGCCTCT ACCCACTGCT TGRAGCCACX: GACAACGATG ACATCIATGG ISO 

GGCTGCCrOO ATCOC3CATAT TTGTGGGCAT CTQCCTCTTC TGCCTGTCTG TTCXAGGCAT 240 

TQTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGSCG TATTTCATTC TGtO&mAOr 300 

AGTAXATGCC TTTOAAGTGG CATCTTGTAT CACAGCAGCA AC3«»ACQAG ACTTTTTCAC 360 

ACCX»CCTC TTCCTGAAfiC AGATGCTAGA GAOSTACCRA AACAACAGCC CTCSCAAACAA 420 

T6ATQACCAO TGGAAA^CA ATGGA(3TCAC CAAAACCTGG GAOUaGCTCA TGCTCCAGGA 480 

CAATTGCTGT 060GTAAATG STOCATCAGA CTOGCAAAAA TACACATCT6 CCTTCCGGAC 540 

IGAGAATAAT GRTQCTGACT ATCCCTGOOC TCQTCAATGC TGTBTXATOA ACAATCTTAA 600 

AOAAOCTCTC AACCTGGAGG CTTGTAAACT AGGOGTGCCT GGTTTTTATC ACAATCAGGG 660 

CTGCTATTGAA CTGATCTCTS QTCCAATGAA CCGACACGCC TGGGGGOTTS CClGGTTTQQ 720 

ATTTGCS2ATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTG6AGCAG 7B0 

ftATTGAATAT TAAGAA 795 

Seq ID NO: C83 DNA Sequence

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71..2560

1 11 21 31 41 51

| | | | | |

AAflGGGGCRA GAGC7GAG06 GAACACCGGC COGCOGTOGC GGCAGCTGCT TOVCCCCTCr 60 

CTCTGCAGCC ATGGG6CTCC CTCXJTGGACC TCTCGCGTCT CTCCTOCTTC TCJCaGGTTTQ 120 

CTQG CTOCAB TGCGOGGCCT COGAGCC3GTG C0GGG0C3GTC 7TCAGGGAGG CTGAA63GAC 180 

CTTGGAGGCG GGAGGCGCGG AGCftGGAGCC CQGCCAGGCG CTGGOGAAAG TATTCAIGeS 240 

CTOCCCTGOQ CAAGAC0CA6 CTCTGTTTAG CACTGATAAT OATGACTTCA CTGTGCGGAA 300 

TGGOGAGACA CraHMSGAAA QAAGGTCACT GflAQGAAAee AATCCATTGA AJ3ATCTTCCC 360 

ATCGSVAAGOT ATCTTAOGAA GACACAAGAG AGATTGGGTG OrTGCTCCaVA TATCTGTCCC 420 

TGAAAATGGC AAOGGTCCCT TCCCCCAGflG AXTTGAATCAS CTCAAGTCTA ATAAAGAXAB 480 

AGflCACCRAG ATTTTCTACA GOVTCAOGQG GCCGGGGGCA GACAQCX:CX:C CTGAGGGTGT 540 

CTTCGCTGTA 6AGAAGGAGA CftSGCTGGTT GTIGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

. GATTGCCAAG TATQAOCTCT TIGGCCAOSC TGTGTCftQAG AATGOTGCCT C3U3TGGAGGA 660 

^3 CCOCATGAAC ATCTCCATCR TOQTOACXSA CCAfSAATGAC CACAAGOCCA AQTTTRCCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG 3X3ATGCAQGT 780 

GACAGCX3UX3 GATGAG6ATQ ATaCX3VTCTA CACCXACRAT GGGGTOGTTG CTTACTCCaT 840 

CCaTAQCCaA QAACCAAAGG AOCCACAOGA CCTCATGTTC AOCATTCAOC GGAGCACAGG 900 

GACCATCftGC GTCKXCTOCA. GTQGCCTQQA COGGGAAAAR GTCCCTGftGT ACACACTGAC 960 

Z>U CATCCAGGOC ACAGACATOG ATOGGGAOGG CTOCACCaMCC AOGGCAjgTGG C3«3TAGTGGA 1020 

QATCCTTGAT GCCAATGACA ATGCTCOCaaC GTITGRCOCC CRGAAGTACS AGGOCCATGr lOBO 

GCCTGftGAAT GCAOTGOGCC ATGAOGTaCA QAQGCTOACQ QTCACTOATC TQGAOOCSCCX: 1140 

CAACTCaCCA GOGTGGOGTG CCACC3ACCT TATCATGGGC GGTGAOGAOG GGGftOCATTO 1200 

TA0CATC3VGC ACCCACCCTQ AQAOCSUkCCA GGGCATCCTG ACAACCAOGA AGGGTTTtSGA 126Q 

TITTGAGGCC AAAAACCA6C ACAOCCTGTA OGTTGAA3TG ACCftAOQAGO CCCCTTTTCT 1320 

GCTOAAGCTC CCAAOCTCC3V CAGCXMCAa: AGTGGTOCAC GTGGA0C3ATG TCAATGaOGC 1380 

ACCTGTQTTT GTCCCRCCCT CCAAAGTCGT TCAQOTCCAJa GAGGGCATCC CCACTGGQGA 1440 

GOCTGTOTGT GTCIACftClG C&GAAGACGC !£GA(3\AGGA6 AATCAAAAGA TCAGCCAOCG 1500 

CI\XCCT6AGA GAGOCAGCAa GGTGGCTAQC CATQOACQCA QACAGTGGGC AGGTCAC1\GC 1560 

OU TGTQGaCAOC CT0GA006TG AG^TGAGCA GTTTGTGAGG AACJiACATCT ATGAAGTCAT 1620 

GGTCTTQGOC ATGGACAATG QAAQCCCTCX: CACCACTSGC AOaQQAACCC TTCTOCTAAC 168Q 

ACTOATTGAT GTCAATGACC ATGGCCCRGT CCCTGAGCCC OGTCAGATCA OCATCTGCAA 1740 

CC AAAGC CCT GTGOGCCAGG TGCTGAACAT CACQGACAAG GACXniGTCTC OO^ACACCTC 1800 

wr^ CCCrrrOCAG GCGC3U3CTCA CAOATQACIC ASIbCWrCXAC TGaAOOgCAO AGGTC3UU3GA 1860 

03 QOftAGOTQAC ACAGTOGTCT TGTGOCIGAA GAASITOCT6 AAGCAGGATA CATATGAOGI 1920 

GCACcnrcr ctgtctoacc atqgcaacaa agagcacsctg aC3qgtqatca ggoccrctgt 19 bo 

GTGCOACTQC CATGGCCATG TCXSAAAOCTG CCCTOGAOCC TGGAAGGGAG GTTTCATCCT 2040 

CXXrTGTGCTG QGGGCTQTCC rGSCrCTGCT GrTGCTCCTG CrQGTGCTQC TTTTHTTQGT 2 1 DO 

TOGAGAAAQAao OQSaftGATCA AGGftGCCCer CCTftCTCCCA GAAGATGftCA COCGTGACAA 2160 

OGTCITCXAC TATGGOGAAG AGGGGGGTGO OaAAGAOOAC CRGGACTATG ACATCACCCA 2220 

GCTCCAOCQtA GGTCTG6AGG CCAGGOCGGA GGTGGTTCTC CX3CRATGAC3G TGGCACCAAC 2280 

CATCATOCOG ACACCCATGT ACOGTGCICXS GCXZftGOCAAC CCAGATGAAA TCGGCAACTT 2340 

™^TAATTQAa AACX:rGRAGG CX3GCTAAC3«: ASACCOCftCA GCOCOGCCCT ACGACACCCT 2400 

rj^ CTTGGTGTTC GACTATGAGG GCftGCOGCIC CSACGCOGCQ TCOCTGAGCT CCCTCACXITC 2460 

/J CrCXSCCTCC GACCAAGACC AAGATTAOSA TTATCT(3AAC Gt^GXGGOGCA GCXXSCTTCAA 2520 

GAAGCTC5GCA GACATGTACG QTGGC3GGGGA OGACXSACrAB GOGQCCTGCC TGCAGGGCTG 25 80 

GGGAOCAAAC GTCAGGCCAC AGRGCATCTC CAAfiGGGlCT CftGTTGCOCC TTCawSClXSAG 2640 

GACTTOGQAG CTTGTCAGGA. AGTGGCX33TA QCAACTTOOC GOAOACIkGaC TATGAGTCTG 27 OP 

Ofk AC3GTTftGAGT GGTTGCrEOC TTAGCaCTTTC AfiGATGGAGG AATGTGOOCA. GTTOSGACXTC 2760 

oU AGCACTGAAA ACCTCTCCAC CTOGQOCAGG GlTaCX3?CAG ACGCCAAGIT TCCaGAAQCC 2820 

TCTTACCTX3C CBTAAAATGC TCAACCCTGT GTCCTGGGCC TGGOCCTGCT OTQACTGACC 2880 

TACAGTGGAC TTTCTCTCTG GAATGOAACC TTCTTAGGCC TOCTGGTGCA ACTTAATTTT 2940 

TTTTTITAAT GCTATCTTCA AAAOGTTAOA GAAAGTTCTT CAAAABTGCA GCCCAGAGCT 3000 

GCTGGGCCCA CTGGCC5GTCC TGCftTTTCXQ GmCCfilGAC KXCMOGCCT OCCATTOGGa 3060 

TQC3ATCTCr& CGTTTTTATA CXOAGTGTGC CTAGQTTGCC CCTTATTTTT TATTTTOCCT 3120 

GTTGCGTTQC TATAGAIOAA GGGTGAQGAC AATCGTGTtAT ATGTACTAGA ACTTTTTTAT 3180 

TAAAGAAACT TTTCCCAGAA AAAAA 3205 

Seq ID NO: C84 DNA Sequence

Nucleic Acid Accession #: NM_005629.1
Coding sequence: 639..2546

1 11 21 31 41 51

TAGTCGGAGC 6AGC3TGGCGA GTCGCTGAGC CCGCCGC3GGC CCCGflOftGCG GCTGCAGCCG tiO 

COBCGGCCGG GAAQGAGAG6 CCGAGQCGCG CCCQAGCOGC GGCOQOOGCC 6CC3UX33CCG 120 

COGCOSCCAC CRC06CCACC GC3AGTCX3CGG GCX:AGCCGG6 CAGCCTCXTQC GGGCGOOGGC 180 

CGGGGCGOGG GGCGCGGGCC ACftiGGCCCCT GCTCXJGGCCG TCX3TTTQCAG ACOSCGGGCG 240 

CCX3ATQTOGC CCGCGCCGCO TTAGGATGAG TCTOGGGTCG GGOGAOGAGC OGOCGCRGCC 300 

GCCCCCGCOC QAGCOGCCSGG CAGGAGCCTC GGQAGCCC3CC GCGGOCSCOG GOGCOGCXXX? 360 

GCCOOGCCCX: GACGCOGGCC GCGCGOCCCC QGGOCCCOGA CACACftTGftS ATTCTTCAGG 420 

CTCACTTTCA AQTGCTTCGT GGACTGCTTC TQACTGCaCC GCCCGCGCCC CGCACCX3CaC 480 

OGTCCGCCCG OCXSCCCCfiTC CCX3CGGCCCX5 GO^GCCCCCC GGCCCCOGGC CGGCCOGOGC 540 

CCrrOQGGGCC CrCCCX3GGT6 CCGCOGGTGC CCCCXS3CCTG ACGGCOGOCC CCCQTQAGGC 600 

GCCGG8ACCC OGGCCOGQCC GTGOOSGCCG CCGGGGCX».T OGCX3AAGAAS AGGGCC3GAGA 660 

ACGGCATCTA TAGOGTGTCC 06CGACGAGA AjSAAGGaCCC CCTCATOGOG CCCGGGCCCG 720 

ACGGQQCX:CC GGOCAAGGGC GACGQCCJCCG TGGGOCTGGG GACACCOGGC GOaC3GCCTGG 780 

COGTQCOGCC GCGOSAGACC TGGAOQOGKX: AfiATGGACTT CATCATGTOG TGCGTGGQCT 840 

TCGCOGIGGG CTTGGGCAAC GTGTQGCaCT TCJCCCTACCT GTGCTACAftG AflCGQOGGAG 900 

GIGTOTTCCT TAXTCCCTAC GTCCTGATOG CCCIGGITGG AGaAATCCX:C ATTTTCTTCT 960 

TACAGATCCC GCTGGGCCAG TlCAlGAAGG CGBQCftQCST CAATGTCTGG AACATCTGTC 1020 

OCCTOTTCAA AGGCCTGGGC TACOCCTOCA TCGTGATOGT CTTCTACTQC AACRCCTACT 1080 

ACATCATCGr GCTGaCCTGG GGCTTCTATT AtXTrGGTCAA QTCCTTTACC ACC&CGCTQC 1140 

OCTGGGCXZAC ATGT6GCCAC ACCTGGAACA CTOCOGACTG CGTGGAGATC TTCCGCCAT6 1200 

AflgaCT GTGC CAATGCCAGC CIGGGCAAGC ICACCIGTGA CXnOCTTGCT GAOCGCOGGT 1260 

OCCCTGTCAT CGA6TTCTGG GAGAACAAAG TCTTOAGGCT GTCTGGGGGA CTQGABGTGC 1320 

CACSGGGCOCr CAACTQGGAG GTGAOCCTTT GTCTGCTGGC CTtSCTGGGTG CTGGTCTACT 1380 

TCTGTGTCTQ GAAX?GGGGTC AAATCCACX?G GAAJVGATCGT GTACTTCaCT QCTACATTOC 1440 

CCTAOSTGGr CCTtSQTCQTG CTCCTGGTGC GTGGftfilGCT QCraCCTGGC GCCCIOGATQ 1500 

GCATCATTTA CTATCTCAAG GCTOACZTQQT C&AAGCTCGG 6TG0CXn:CAfi GTOTGGATAG 1560 

ATQCGGGGAC CCAGATTTTC TTTTCITAG6 OCATIGGCCT GGGOaCCCTG ACAGCCCTGG 1620 

GCftGCTACRA CCGCTTCAAC AACAACTGCT ACAAGGACGC CATCATOCTG GCTCTCATCA 1680 

ACaaTGGGAC CAGCTTCTTT GCTGGCTTCG "rGGrCTTCrC CaTCXTTGGCSC TTCATGGCTO 1740 

CAGAGCAGG3 GGTGCACATC TCCAAGGTOG CAGAGTCAGG GCOGGGCXTTG GGCTTCATC3G 1800 

CCTACCCGOG GGCIGTCAC3 GTGATGGCAG TGGCCCCACX CTGGGCTGOC CTGTTCITCT 1860 

TCATGCrGrr GCTGCTTGCT CXOGACAQCC AQTTTGTAGG TGTGGAiSGGC TTC3VTCACCG 1920 

GCCrcCTOGA CCTCCTGCCG GCCTCCTACT ACXTtXETTT CCAAAOGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TaCXX:TCTGC TTTQTCATCO ATCTCTCCAT GGTGACTGAT GOaJQGATGT 2040 

ACQTCTTCCA GCTQTTTGAC TACTAlCTCGG CCAGCGGCAC CACCCTEQCXC TGGCAGGGCT 2100 

TTTGGGAGTQ CGTGGTQGTG GCCIGGQTGT AOGGAGCIGA 0GGCTTC2ATa GACQACATTG 2160 

CCIGTATGAT OGGGTACCGA CCTTGCCOCT GOATGAAATG GTGCTGGTCC TTCTTCACCC 2220 

C3C3CTGGTCTQ CATGGGCATC TTCATCTTCA ACGTTGTGTA CTACSAGCCG CTGGTCTACA 2280 

ACAACACCTA CGTGTACCCX3 TGGTGGGGTG AGGCCATlSQa CTGOaCCTTC 6CCCTG10:t 2340 

CCATGCTGTG CGTGCOGCTG CZ\CCTCCTQQ GCT G OCTCCT CAGGGCX!AAG GGCACCATGG 2400 

CIGhQCGCIG GCAGCACCia AOCCAGCCCA TCTGGGGCCT CCACCACTTQ GftGTACOGAG 2460 

CTCAGGACGC AGATOTCAGG GQCCIOACCA CXCTGACCCC AGTGTOOSHS AGCAGCAAGG 2520 

TCGTOSIGQT GGAGAGlGTC AIQTOACAAC TCAGCTCACA TCACCAGCTC ACCTCTGSTA 2580 

GCCATABCAG OOCCTOCTTC AGCCOCACGG CACCCCTCCA OGGGGOCTGC CTTTCCCTQA 2640 

C ACTTT TSSG GTCTGOCTGG GGGnQQAaOG 6AGAAAOCAC C»G2USTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTXTAATAA AAOTCCAAAA AXAVCAChAC GCSUXAAAAA TAGATGCCTC 2760 

TCCCXCTCm QCC3CTAGCCG AQCTOGTCCT JW3©CCCC3GOC TmxQCCCCn CCCCCAOTCA 2820 

CftSlQCrqCA CTCCTOCTOC CCCTOOCACS CCX2ACCC(3CT QCOCACCTCT CXSUSGCTCTQ 2880 

CrCTGCAGCA CAOCC3GTGGG a?GACOCCTCA OCOCAGAAGC AGCAGTGGCA GCTTQaOAAA 2940 

IGIXSAGGAAG GGAAGGAGOG AGAGACGGGA GGGAGGAGAG AGAGQAOAAO GGAGGCAGGG 3000 

GADOOGCAGC AGAACCAAGG CRAATATTTC AGCTOOGCTA ^EACCOCTCTC COCAXOCXrTG 3060 

TTATAGAJ^C TTAGfiGAGCC AGCCACSCAAT QGAACCTTCT GGTTCCTGGB CCAATC3GCCA 3120 

CCRGTATCM TTGTGliGAaC TTGGGTGCGA GTGCACXSCXSr QCGTOAaTAC GGAGAGTAaa^ 3180 

TAZAQATdC TATCTCTTAG CRAAQGTQAA TGCCAGATGT AAATGGC60C tClGOQCAAA 3240 

06AGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA AlGftOATTTC TGCTIGTATA 3300 

TTTCTAAAAA GAGOAAGOAG CCCAAACCAT CdCTCCTTA CCACTCOCAT GCCXOTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGG ftSTGTGAATT l&TflGATCTA ACTPTCATftG 3420 

GCAAAACAAA AGCTTCOAGC TGTTGCGTGT GTOAGTCTaT TGTGTGGATG TGCGTSrOTG 3480 

qrCCOCiftQOC CXAGACXGGA TTGGAAAAGT GCAXGGTGGG GGCX^^CXSGGG CTGTCCCCAC 3540 

eCTOTOCCTT TGCCACAAGT CTGTQOQaCA AGAGGCTGCA ATATTCOSTC CTGGGTGTCT 3600 

GGGCTGCTAA OCTGGCCaTGC TCAGGCTTCC CaCCCTGTGC GOGGCACACC CGCAGGAAG6 3660 

GACCCTGGAC AOaaCTCCCA CGTCCAGGCT TAAGGTGGAT 6CACTTCOCX3 C3hCCTCCAGT 3720 

CTTCTGTGTA GCAGCTTTAA CCCAOGTTTG TCTGTCAOGT CC3^GCGGA GAOSGCTGAG 3780 

TGACC0C31AG AAAOGCTTCC CCGACACCCA GACAGAGGCT GCAGGGCTGG QacrGGGTGA 3840 

GGGTOGOSGG CCTGOGGGGA CATTCTACTG TGCTAAAAAG OCACXGCAGA CATAQCAATA 3900 

AAAACAtGTC ATTTTCC 3917 






Seq ID NO: C85 DNA Sequence 

Nucleic Acid Accession #: NM_006516.1 

Coding sequence: 180..1656 

1 11 21 31 41 51 

| | | | | | 

TABXaSGGGG TCCC3CQAGTG ASCAOSCCAG OGAGCAGOAG ACCAAACOAC GGGGGXCGGA 60 

GTCAGAGTCG CAGTGOGAGT CCC0GCSACC6 GAGCACQAGC CTGAGCGGGA GAGGBCCBCT 120 

OaCACGCCOG TOGOCRCCOG CQTACCGOGC GCfti3CCM?A6 CCRCCRGCX3C AGGGCTGCCA. 180 

TGGAGCCCAC CAQCAAOAAG CT(3ACGGGTC GCCTGATGCT GGCTGTGGGA OG?M3CA(3TaC 240 

ITGGCTCCCT GCAGTTTGGC TACAACACTG GAGTCATCAA TGCCCCCCAG AAOOTGATCG 300 

AOGAGTTCTA CAACt3VGAjCA. TGG6TCCACC GCTATGOGGA GAGCATOCTG CCCAOCACGC 360 

TCAOCACGCT CTGGTOCCTC TCAGTGGCCA TCTTTTCTGT TGOGGGCATG ATIG8CTCCT 420 

TCTCTGTGGG CCTTTTCGTT AACCGCTTTG GCCGGCGGAA TTCAAT6CTG ATGATGAACC 480 

TGCTGGCCTT OGTGTCCGCC GTGCTCATGG GCTTCTCGAA ACTGGGCAAG TCCTTTGAGA 540 

TGCTGATCCT GGGCOSCTTC ATCATCGGTG TGTACTGCGG CCTGACCACA GGCTTCGTGC 600 

OCATGTATGT GGGTGAAGTG TCACCCACAC CCTTTCGTGG CGCOCTGGGC ACCCTGCACC 660 

AGCTGGGCAT CGTOSTCBGC AXCCTCATCG CCCAGGTGTT GGGCClGGAC TCCATCATGG 720 

GCAACAAGBA CCTGTX3GOCC CTGCTGCTGA GCSV-TCATCTT CATCOCGGOC CTGCTGCAGT 780 

OCATCGTGCT GCCCTTCTGC CCCGAGAGTC CCOGCTTCCT GCTCATCRAC CGCAACQAGG 840 

AGAACCX5GGC CAAGAGTGT6 CTAAAGAAGC TGCG03GGAC AGCTGACGTG ACCCATGACC 900 

TGCAGGAGAT GAAGGAAGAG AGTCGGOUSA TGATGOGGGA GAAGAAGGIC ACCATCCTOG 960 

A6CTGTTCCQ CTCCCOCGCC TACOGOCAGC CC^TCCTCAT C6CTGTGGTG CTGCA6CTGT 1020 

CCCAGCAGCr GTCTGCSCATC AAOQCTGTCT TCTATTACTG CAOGAGCATC TTCGAGAAOG 1080 

COGGG6IGCA GCflaDCTGTG TATGOCACCA TTOGCTCCGG TATCXJTCAAC ACGGCCTTCA 1140 

CTGTC3QTGTC GCTGITTGTG GTGGAGCGAG CAGGCCGGCXS GACXXIXGCAC CTCATAGGCC 1200 

TCGCTGGCAT GGCGGGTTGT GCCATACTCA TGACCATCGC GCTAGCACTG CTGGAGCAGC 1260 

TACCCTSGAT GTCCTATCTG AGCATCGTOG CCAXCTTf GG CTTTGTOGCC TTCmGAAB 1320 

TGGGTCCTGG CCCXa"KX3CA TGGTTCATCG TGGCTGAACT CTTCAGCCAS GGTCCACGTC 1380 

CAGCTGDCAT TGCCCSTTGCA GGCTTCTCCA ACTGGACCTC AAATTTCATT GTGGCSCATGT 1440 

GCTTCC3U3TA TGTGGAGCAA CTQTGTGGTC CCTACX3TCTT CATCATCTTC ACTQTGCTCC 1500 

IGGTTCIGTT CTTCATCTTC ACCIACTTCA AAGTTCCTGA GACTAAASGC OGGACCTTOG 1560 

ATCAGATOGC TTCC3S6CTTC C3GGCAGGGG6 GAGOCAGCCA AACrCSATAAG AjCACCGBAQQ 1620 

AGCTGTTCCA TOCCCTGGGG GCTGATTCCC AAGTGTGAGT C3GCCCCAGAT CACCAGOCOG 1680 

OCCTGCrrCCC AGCAGCCCTA AGGATCTCTC AGGAGC3«3W3 GCAGCTGGAT G2VGACTTCCA 1740 

AACCTGACAG ATQTCAQCCa AGCCOGGCCT GGGGCTCCTT TCTCCRQCCA GCAATGATCT 1800 

IX3U3AAGAAT ATTCftGGACT TAAOGGCTCC AGGATTTTAA CAAAAGCAAG ACIGTIGCIC 1860 

AAATCTATTC AGACAAGCAA GAGQITTTAT AATTTTTTTA TT%CTaATTT TOTXATOTTT 1920 

ATATCAGCCT GASTCTCCTG TGCCCAC^TC CCAGGCTTCA COCTGAATOG TTOCATGOCT 1980 

GAGGGXGGAG ACTAAGCCXTT GTCOAGACAC TTQCCTTCTT CACCCAQCTA ATCTGTAGGG 2040 

CTGQAOCTAT GTCCTAAQGA CACACTAATC GAACTATGAA CTACAAAfiCT TCTATCCCAG 2100 

GAGGTGGCTA -IGGCCACCOG TTCTGCTGGC: CTGSATCTCC CCACTCTAGO GGTCAGGCTC 2160 

CATTAGQATT T6CCCCXTCC CATCTCTTCC TACGCAACCA CTGAAATTAA TCTTTCTrfA 2220 

CCTGAGACGA GTTGGGAGCSPi CTGQAGTGCA GGGAGGAGAG GGOAAlOGGCC ASTCTGOaCT 2280 

GCCXSQGTTCT AOTCTCCTTT GCACTGAGQG CCACACTATT ACCATGAGAA GRiaGGCX:TGT 2340 

GGGAGCCTGC AAACTCACXG CICAAGAAGA CATOGAGACT CCTGCCCTGT TaTOTATAGA 2400 

TGCAAGATAT TTATATATAT TTTTGGTTCT CAATATTAAA TACAGACACT AA6TTATAGT 2460 

ATATCIG6AC AAGGCAACTT GTAAATACAC C3UCCTCACTC CiGTmCTXA GCTRAACSUSA 2520 

TATAAATOOC TOGTTTTTAG AAACATGGTT TTGAAATGCT TGTGGATTGA GGGTAGGAGG 25B0 

TTTGGATGG6 AGTGAGACAG AAGTAAGIGG GGTTGCAACC ACTQC7VACGG CTTAQACTTC 2640 

GACTCAGOAT CCAGTO^CTT ACAOGTACCT CTCATCAGTS TCCTCTTQCT CAAAAATCT6 2700 

TTTGATCCCT GTTACCCAOA GAATATATAC ATTCTTTATC TTGACATTCA AGGCA7TTCT 2760 

ATGACATATT TGATAGTTG6 T6TTCAAAAA AACACTAGTT TTGTGOCAGC CXSXGKraCTC 2820 

AGGCTTGAAA XCGCATTATT TTOAATOTGA AGGSAA 2856 

Seq ID NO: C86 DNA Sequence 
Nucleic Acid Accession #: XM_035292.2 
Coding sequence: 53..1576 

1 11 21 31 41 51 

| | | | | | 

GCrCGCTQGa CCXSGG6CTCC OSGGTGTOCC AGGGOOGGOC GGTGOGCAGA GCATOGGGGG 60 

T(3C33GGCOCG AACOGGGGOS OQCTAGCBGC 6CCG60GGOC GAGOASAAGQ AAfaAGQCGOG 120 

GSASAAiSATQ CTGOCOGCCA AGAfiCGOSGA CGGCTGGG06 GOGGCAGGGG ASGSCGAfiOQ 180 

GGTGAOCCTG CAGCGGAACS^. TCACGCTGCT CAAGGGGGTG GGCATGATCG TSGGGAOCAT 240 

TATC3GGCTCO OGCATCTTCX} rraACGOCCAC GGGCGTGCTC AAGGAGGCRG GCTOGCOGGG 300 

GCTGOGGCTG GTGGT6TG6G CG6GGTGOG6 OSICIXCTCC ATGGrGGQCG OGCTCTGCTA 360 

08C9GGAGCIC GGCAlOCaGCA TCTCCAAATC GGOaSGCGAC TAGQCCTACA TGCTGGAGGT 420 

CTACGGCTOa CTGCOOGOCT TCCTCRAGCT CTGGATOGAG CTGCTCATCA TCCCX30CCTTC 480 

ATasaUxTAC ATOST06GOC TGGTCTTCGC i::ACCrACClG CTCSUUSCOGC TCITCC3CCAC 540 

CTGOCCQ6TG CCCQAOGAGG CAGCCAAGCT OGTGGCCTGC CTCTGCSGTGC TGCTGCTCAC 600 

GGCGQTGAAC TGCTACAGGG TGAAGGCCGC CAJC3CGGG6TC CAGGATGCCT TTQCCSCXXaC 660 

CAAGCTCCTG GCXXTOGCCC TGATCATCCT GCTOSGCTEC OTOCAGATOG GAAAOGGTGii 720 

IGIGTGCAAT CTAGATGCCA ACTTCTCATr TGAAGGCACC AAACIGGAIG TGGGQAAOIT 780 

TGTGCTGGCA TXATACAGCQ GCCTCTTTOC CTATGGAGGA TGGAATTACT TGAATTTCGT 840 

CACAOAGGAA ATGATCAAOC CETACAGAAA CCTGCXTCCTG GCCATCATCA TCTCCCTQCC 900 

CATCGIGACX3 CXGQTGTAOG TGCTGACCTU^ GCTTGGCCTAC TTCACCACOC TGTCCACCGA 960 

GCAOATGCTG TOGTGCGAGG CaSTGGCOQT GGACTTCaSGO AACXAIOUSC TOGSOGTCAT 1020 

GTGCI6GATC ATCCMCGTCT TGQTGGGCXrr GTOCTGCTTC GGCICCGTCA AXGOGTCCCT 1080 

GTTCACATOC TCCAGGCTCT TCTTCGTGGG GXCCCGGGAA GGGCACCTGC COTCXaVTCCT 1140 

CTCCATGATC CAOCCACAGC TCCTCACCOC CGTGCCGTCC CTOGTGTTCA GGTGTGTGAT 1200 

GAOOCTGCTC TACGCCXTCT CCAAGQACAT CTTCTCOGTC ATCAACTTCT TCRGCTTCTT 1260 

CAACTGGCTC TGCGTGGCOC TCGCCATCAT CaiGCATGATC TOGCXGOGCC ACAQAAAlGCC 1320 

ITQAGCTTGAG CG6CCCATCA AGGTGAACCT GGCCCTGCCT GIGTTCITCA TCCTGGCICTO 1380 

CCTCTTCCCG AT0GCXX3TCT OCTTCTGGAA GACACCCQTO GAGTGTGGCA TOGGCTTCAC 1440 

CATCATCX3C AGC3QGGCTGC CCGTCTACTT CTTOGGGGTC TGGTQGAAAA ACAAGCCCAA 1500 

GTGGCTCCTC CAGGGCATCT TCrCCAC2QAC CXSTCCTGTGT CAGAAGCTCA TGCAGGTGGT 1560 

CCCCCAGGAG ACATAGCCAG GAGGCCGAGT aGCTGOC3QGA GGAGCATGC 1609 

Seq ID NO: C87 DNA Sequence 

Nucleic Acid Accession #: NM_005268.1 
Coding sequence: 168..989 

1 11 21 31 41 51 

| | | | | | 

TAAAAftGCAA RAGAATTOGC GGOCGa?rCG ACACQGGCTT CCCCGAAAAC CTTCCCCGCT 60 

TCTGIS&TAXG AAATTCAAGC TGCTTOCTGA GTCCTATTC3C OGGCTGCrGG 6AC3CCAGG&G 120 

AfiCGCIGAlOG AGTAGTCACT CAGITUSCAGC TOACGCGIOG GTCO^CCATG AACTGGAQTA 180 

TCTTTGAOGG RCTCCTGAGT GGGGTC?UVCA AGTACTCCAC ASCCXTTGGG OGCATCTGGC 240 

TCTCTCTTOGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACG<3C3CQACJ CGTGTGTGGA 300 

GTGATG&CCai CAAGGACTTC CACTGCAATA CTQGCCRGCC CGGCTGCTCC AAOGTCTGCT 360 

TTGATQAOTT CTTCCCIGIG TCCCATGIGC GCCTCrGGGC CCTt?CM3ClT ATCCTGGT6A 420 

OVTGCCCCTC ACTQCFCGTG GTCATGCACG TGGCCTACCC3 GGAJSGTrCAS GAGAAGAGOC 480 

ACCGAGAA^ CCATGGGGAG AACAX^TGGGC GCCTCTACCT GAACCXXGGC AAGAAGCX?GG 540 

QTGGGCTCrG CJrtJGACATAT GTCTGCAGCC TAGTGTTCfiA GGCGAGCGTG GACATCaCXTT 600 

TTCXCTATOT GTTOCACTCA TTCTACCCCA AATATATCCT COCTCCTGTG GTCAAGTGCC 660 

AQBCftCTlTOC ATBTOCXKAT ATAGTGGACT GCTTCATCrC GAAGGCCTCA GAQAAGMCA 720 

TTTTCRCOCr CTTCATGGTG GCC3VCftGCTG CX3VTCTGCAT CCTaCTCRAC CTCC3TGGAGC 780 

TCJVTCTACCT GGTGAGCAAG AOATGCCACG AGTGCCTGGC AGCRAGGAAA CSCTCaAQCCR 840 

TGTGCACRQQ TCATCACOCC CACX3GXACCA CCXCTTCCTG CRAACAAOAC GACCTCCTTT 900 

CGGGTGACCT CaTCTTTCTQ GGCTCAGACA GTCRTCCTCC TCTCTTACCA C3AGQGCXrCCC 960 

GAOROCSkTaT GAAGAAAAOC ATCTXGJWa OGGCTGCCTG GACTGGTCTG GCAGGITGG& 1020 

CCTGGATGQG GAGGCTCTAQ a^TCTCTCAT A6GTOCAACC TGAGAGTGGG GGAGCTAA6C 1080 

CATGAGGTAB GGQCAGGCAA GA6AGA0QAT TCAGACGCTC TOGQAaOCAB TTOCTAGTCC 1140 

TC3iAnrC3CAG OCACCTGCCC CAOCTCS3ACG GCACTGGGCC AGTTCCOOCT CTGCICTBCA 1200 

GCTOGGTTTC CTTTTCTAGA ATQtSAAATAG TGA)GGGCC3^ TGC 1243 

Seq ID NO: C88 DNA Sequence 
Nucleic Acid Accession #: NM_005130 
Coding sequence: 98..802 

1 11 21 31 41 51 

| | | | | | 

CTCTACCTGA CACAtSCTGCA. GCCTQCAATT CACTCCCACT GOCTGOGATT GCACTQGATC 60 

CGTGTGCTCA OAACAAGGTG AACGCOCAGC TGCftGCCATG AAGATCrOTA GCJCTCACCCT 120 

GCTCTCX^TTC CTCCTACTGG CrGCTCAjGGT GCTCCTCGTG GA60GGAAAA AAAAA5TQAA 180 

GfATGGRClT tnmOCAAAG TGGTCICAGA ACAAAAQGAC ACZCXG6GCA AC3^CAGAT 240 

TAAGCAQAAA AGGAGGCCO& OGAACaUUUSG CaakOTTTGTC ACCAAASftCC AAGCCAACTS 300 

CWSATGGGCr GCTACTGRGC AGGAGGRGGG CATCTCOrCTC AAGGTTfiAGT OCACTCAATT 360 

GGAC3CATGAA TTTTCCTGIC TCTTTGCTGG CAATCCRACC TCATCCCTAA AGCrC3\flGGA 420 

TGAGAGA6XC TATDGGAAAC AAGTTGGC06 GAATCTGOGC TO^CAGAAAS ACATCTGTAG 480 

AIWTTCCAAG AOUSCtGTGA AAACOUSAST GTOCAQAAAG GATTTTOCAG AASrCCASTCT 540 

lAAGCXAGTC AGCTCCACTC TAITTGGGAA CACAAAGGCC AGGAAGGAGA AAACAGA6AT 600 

GTCCCCCAGG GftGCACATCA AGGGCAAASA GACCAGCCC3C TCTAGCXTTAG CAGTGACCXZA 660 

GACCRTGGCC ACCRAA6CTC COGAGTGTGT QSAGGACCCA GATATGGCAA ACCAGAGGAA 720 

GACTGCCCTG GfiGTTCTGTO QAGAQACTTG GAGCTCTCTC TGCACATTCT TC3CTCAQCAT 780 

AGTGCAGGAC AGGTCATGCT AATGAOGTCSi AAAGAGAAGG OGITCCrTTA AGAiGaTGTC& 840 

TGTGOTAAGT OCClCTGTAT ACTTTAAAGC TCTCTACAGT COOCCCAAAA TATGAACTTT 900 

TGTGCTTAST GASTGCAACG AAATATTXAA ACAAGTTITG TATTTTTTQC ■ lITi ' ^j ' AXy ' r ' r r 960 

TGGAATTTGC CTTATTTTTC TTQQATGCJQA TGrTCAQAOG CTGTTTOCT6 CAGCATGTAT 1020 

TTOCATGGCC CSICACAGCTA TGTGTTTGAG CAGOGAAGAG TCrElGAGCI GAATGAGCCA 1080 

GaGT GAaaAT TTCASTGCM CGAACmCT GCTGAATTAA TGGTAATAAA ACTCIGGGlG 1140 

TTITTCAAAA AAAAAAAAAA AAA 1163 



Seq ID NO: C89 DNA Sequence 

Nucleic Acid Accession #: BC022542 

Coding sequence: 274..927 

1 11 21 31 41 51 

| | | | | | 

ACTTGGTCfX: AG0C6ATAAA TCTGGOGCAG OQCGCGaTAG GAGCTGOGGG OGGCCAGGOC 60 

CCTTCCTGCG ICOGCACCTO GCCCCGOGOG OCCCTCTCGG GOGTCGGQCT TCOSGOOTCC 120 

TGGOGGCIGS OGTGGCGGOG GTTGGG6CX36 OCGOCXGGCT GCTCCIOQGQ GOGGOGAOGG 180 

GGCrC3«3QCG OGQGCCCGCC ACGQCCTTCA CCaOCGOGOG CXCTGROeOC GGCATAAGOG 240 

CCATO TGrXC TGAAATTATT TTGAGGCAAG AAGTTTTGAA AGArTGSTTTC CACAGAGACC 300 

TFi-JtAATCAA AGTGAAGTIT GGGGAAAGCA TTGAGGACrT GCACAjCGTCC GGTCTCTTAA 360 

TTAAACAGGA O^TTGCIGCA GGACTTTATG TCGATCOGTA TGAdSiaXSGCT TCSOTAOGAG 420 

AGAXSAAACAT AACAGAGGCA GT6ATGGITT CAGAAAATTT 7GATATAGAG GCCCCTAACT 4B0 

ArriGTCXAA GGAGTCTQAA GTTCICATTT ATGCCAGACXJ AGATTCACRG TGCATTGACT 540 

OTTTTCAAGC CTTTTTGCCr GTGCACIGOC GCTRTCSi^rCG GCX33CACAGT GAAGATG6AG 600 

AAQCCTOSAT TGTGGTCAAT AACCCAGATT TGTTGATGIT TTGTGACCAA GAGTTOCOGA 660 

TTTTGAAATG CTGGGCICAC TOUSAAGTGG CAGOOOCTITG TGCTTIGGAT AAXGAGGATA 720 

TATGCCAATG GAACAAGATG AAOTATAAAT CAGTATATAA GAATGTGATT CTACAAGTTC 780 

CAGTGGGACr GACTGTACAT AOCTCTCTAG TATGTTCrGT GACTCTGCTC A-TTAOkATCC 840 

TGTGCTCTAC A'TT^TCCTI 6TAGCAGTTT TCAAATATGG CTATTrTTCC CTATAAGTTT 900 

TATGTASrCA AATGCTTCCT AGAAACCTAA ATAAGATCTA TTAATTTCT6 AGGAGAGGTG 960 

ttcttctaga attaattact tttatctttt GTCXTCATTT GTGGCCAAAA TTATGTTOAC 1020 

TAGAGGAAAT TTGGGATCAT TCTCAGCTAA TTCCAAAATG TAGTGCTCTA TIGCATGGAT 1080 

CCXTGGtAAX CCTCAAGCAT CAGATGCCAT AAGGGGAAAC TTAATTCTGC TAAATTAATG 1140 

TTTATTTVGT GAGAAGTGAC TTTATCTTCA TTTGGQaTAQ AAAAATTATT TCTTTATGTA 1200 

OTAGAGACAA ATTATTCTCA TTTTGCAAGT ACTTTCAATT TAAGCXACAA ATTOAQAAAA 1260 

OCGTTATAAA TAAGAATAAA ATAGOCXrAGG CACAGTGGCT CACACCTGTA ATCCCAGCAC 1320 

XTTGGSAGGC OSAGGTGGGC GGATCACCAa AQQTCAAGAG TTTGAGACCA GCTTGQTOJUW 1380 

ACX3CTGTCTC TACTAAAAAT ACAAAAGTTA GCTGGGGClG GT6GTQQQCA TCrGTAGTOC 1440 

CMSCVAAITG GAAGGGIGAG GClGOGAGGAT OGCXTGAACC TOGGAGGGGG AGGXTGCA6A 1500 
gagccaagat 
ggaaaaacaa 
atgtcatgag 
ctaagaaatt 
ajcgcactcca 

GGftClTGATG 
TTGCCATTTT 



CGCACCACTG 

ACTATTAAAG 
AATATTAATA 



AAACTGAGTA 
AAATAAAiGTT 



CACTACAGCC 
TAAAATAATT 
ATGTGCCAGA 
TAAAAATTAT 
ACATTTTATC 
CTAAGATTTG 
GTACATGJiAC 



TOOGOOACAG 
TGGATQAAAA 
6TTTCAATGA 
TGATAATCTT 
ATGTrrCTTT 
GTACAGAGTA 
AAAAAAAAAA 



AACQAGACCC 
TCAT6TTTAT 
AAATC3VTTAA 
AAATTATTGA 
TQAATAIATa 
TGTCAGGAAG 



TGTCTOCAAA 
TTAAATAGTA 
AGTAGGACAG 
TTATTCCTTA 
AATTOQCAAA 
ACAACTCAGA 



1560 
1620 
1680 
1740 
1800 
1B60 
1906 
Seq ID NO: C90 DNA Sequence 
Nucleic Acid Accession #: NM_004994 
Coding sequence: 20..2143 



1 
1 

AGACACCTCT 
OGGCTGCXGC 
CCTGAGAACC 
CACTOGGGrG 

cx:agaa6Caa 
gcgaacccca 

CAAGTGGCAC 
GGCGGTGATT 
CaOCTTCACT 
GCAOOGAGAC 
TGGCCCXSGC 
GGGC6TCGTG 
CATCITGOAQ 
CTGQTGCAGT 
GAtSAClCTAC 
CC3U^0GCCAA 
CGOCACCACC 
CTCOAOSGTa 
GGGTAAGGAG 
TACCACCTC3G 
TTTGTTCCTC 
GCOOGASGCG 
GGAOGTGAAT 
AACCACCACC 
TGTCCAOOCC 

TGCCXGCAAC 
CAASQATOGG 
CCTTATOGCC 
GCTCTCCRAa 
GGTGCTGGGC 
GGGGQCCCIC 
GTT06A0QTG 
CCCCXSGOGTO 

CJCJysGACCXx: 

GGGCTACGTS 

CMACIGGTA 
TCACCTTTGT 



11 
I 

GCCCTCACCA 
TTTGCTGCCC 
AATCTCACCG 
GCAGAQATGC 
CTGTCCCTGC 
CGGTGCGGGG 
CftCC&CAACA 
GAOQACQCCT 
CGOGTGTACA 
GGGTATCCCT 
ATTCAGGCSAG 
GTXCCRACTC 

ACCAOGGCCA 
ACCCG6GACG 
TCCTACTCCG 
GCCAACTACG 
ATGOGGGGCA 
TACTCGACCT 
AACTTTGACA 
GTGGOGGCGC 
CTCATOTACC 
GGCA70CGGC 
ACACOGCAGC 
TCfiGftGCGCC 
ACTGCTGGCC 
GTGftACATCT 
AAGTACTGGC 
GACAAGTGGC 
A&GCTTTTCT 
CC33AGGCGTC 



AAGGCGCASA. 
CCTTTGGACA 
TICTACTGGC 
ACCTATGACA 
GIAhftTOCCC 
TTCTGTTCTQ 
TTTTTGTTGG 



21 
1 

TGAGCCTCTG 
CCAGACAGOG 
ACAGGC&3CT 
GTGGAOA3TC 
G06AGACCGG 
TCCCAGACCT 
TCACCTATTG 

TTQcccacac 

GCXX3GGAC36C 
TCOACGOQAA 
AOGOOCATTT 
GGTTTGGAAA 
ACTCrrOCCTQ 
ACTACGACAC 
GCAATGCTQA 
CCTGCACCAC 
ACCX3GGACAA 
ACTCGGCGGG 
GTACC2U3GGA 
6CX3ACAAGAA 
ATGAGTTCGG 
CTATGTACXG 
ACCTCTATGG 
CCAOGGCTCC 
CCACAGCTGG 
CTTCTAOGGC 
T06A0GCCAT 
GATTCTCTGA 
C0GC3GCTGCC 
TCTTCTCTGG 
TGOACAAGCT 

GOGaaftftGaT 

TOGTOaATCC 
CGCACS3ACGT 
GCGTGAGrrC 
TCCTGCAG!XG 
ACTGOGAGCA 
GAGOAAAQOa 
AGTGTTTCTA 



31 
1 

GC3tf3CCCCrG 
CCAGTCCACC 
GGCAGAGGAA 
GAAATCTCTQ 
TGAGCIGGAT 
GGGCAGATTC 
GATCCAAAAC 
CTTCGCACTQ 
AfiACATCGTC 
GGACQGGCTC 
CGACGAT6AC 
CGCAGATGGC 
CACCACDGAC 
O6A0GACGGG 
TGGSAA&CCX: 
GGAOGGTOGC 
GCTCTTCGGC 
GGAGCTGTGC 

OGGCCGaaaA 

GTGGGGCTTC 
CCACGGGCTG 
CTTCACT6AG 

TOcroGcccr 

CCOGAOSGTC 
CCCCACAGGT 
CACTACTGTG 
CGCGGAGATT 
GGGCAGGGGG 
CCGCMUSCTG 
GCGCCAGGTG 
GOGCCTGCIQiA 
GGTGCTGTTC 



dl 
1 

GTCCTGGTGC 
LrJ."l"GTGC TCTT 
TACCTGTACC 

AG06CCACQC 
CAAACCTTTG 

TACTCX3GAAG 
TGGAGGGCGa 
ATCCAGTTTG 
CTGGCACAOG 
GAGTTGTGGT 
GCGGOC2X3CC 



TTTOGCTTCX 
TGfXayGTTTC 
TCOGACGGCT 
TTCTGCC3CGA 
GTCTTCCCCT 
GATGGGCGCC 
TCCCOGGACC 
GGCTTAGATC 

GAACCIGAGC 
TGCOCCAOCG 

CCTTTGAGTC 
G6GAACC3U3C 
AGCOGGCOGC 
GACIOGGTCr 
TGGGTGTACA 



SI 

1 

TCCTGGl^T 
TCCCTGQAGA 
GCTATGGTTA 



AGGGGGGGGC 



CITCCAGTAC GSAGAGAAAG 
CCGGAGTGAG TTGAACCAGG 
GCXTGAGGAC TRGGGCTCOC 
ACCCTGGGGA AGGAGCCAGT 
AOTAOTGOAG GTGGGCTaOQ 
ATAAACXTG8 ATTCTCTAAC 



TGftAGGCCAT 
AGGGCaACCT 
ACTTGCCGCG 

TOAoacoacT 

GTGTOGOGGA 
CCTITCCTCfl 
CCCTGGGCAA 
ACTTCCCCTT 
AOaOCTTGCC 
G0C0CAGC9GA 
CATTCATCTT 
ACOGCTGGTG 
CCOGAGCTGA 
TCAC'J"i"i'CCT 
TCTGGTGCGC 
AAOGATACAG 
ATTCXn:CAGT 
TGCATAAGGA 
OVCGGOCTCC 
GAOCCXXXAC 
CiGGCCXSiRC 
OGGTGGACGA 
TGTATTT6TT 
AGGGCCCCTT 
TTGAGGS^fiCC 
CAGGCGOGTC 
CCX3U3GTaAC 
GCCTCTGGAG 
ACCGGATGTT 
CCTATTTCTG 
TG6ACCAAGT 
QTCCTGCTTT 
TTGOCGGAITA 
COCTCTCTTC 
CTTT 



Seq ID NO: C91 DNA Sequence 

Nucleic Acid Accession #: NM_000213 

Coding sequence: 188..5656 



11 
\ 



gogci;gocog cctggtccgc 

GCTGCAGCCC 
AGGTCCAGGA 
GCAGGGCCAC 
C7CTCTGGGA 
GTCOQIOTGS 
AACACCCAJS6 
AGGAGCTTCC 
TCXXIOCCAAG 
GTGTTTGAGC 
ATGIGOGATG 
CnSCTCACCA 
CS^CGGACA 
TOCTTCAAGA 
OGAGAGGGGA 
ACA6CTGTGT 
TCCAICOGAGT 
AGCCGCAACQ 
CAG6ACTACC 
ATCTTTGCTG 
GTCTCCXCAC 
GCCITCAATC 
COSACAGAGG 
CGGOGGGAAdS 



GCCCCGAQGT 
GAAGAOGATG 
CAGOSTCAGC 
GHCX36AGTGI 
CGSSCCKA'GC 
OGTCA1GGAG 
CAGCCAGATG 
TGAOCTGGAG 
CTCCAACTCC 
GGTCCTGAQC 
CAGCGTCCOG 
OCCCCCCTTC 
TAAACTGCAG 
CKTCCTGCAG 
GCTGGTCTTC 
TQGCATCATG 
GTACAGOACA 
CATCATOCCC 
CTATTTCCCT 



21 31 

I 1 
ACCCCCCAAC 
CATCTCCTAG OSGCAGCOCA 
CGGGGGCACA GCAGCAGCOG 
GCCXZCAOCCC ATGGGCCAGO 
OCTTGGCAAA COGCTGCAAG 



TCACAXCOGG 



06GAGGTGCT 
AAATCACAlQA 
GCCPGaaGGT 
CACrGQAOAa 
ATCTGGACAA 
GCGACTAOIC 
TQAGGCCTGA 
AOGTCATCAG 
TCrCAGGCAA 
GCAGGAGGQA 
CRfiCCTTCCA 
ATOAAOGOTG 
CGTCGGTGOC 
TCMXAACTA 
TGGGGGIGCT 
GGATGOGCTC 
TCAGCCCCIU^ 
TOGGTATATA 



OGCOGOGGGC 

aaaGAocx!AG 

OCGTCTGCGG 
GGOCQTGOAC 
OCTCAAGAAG 
TA^FFGGATTT 
GAAGCTGAAG 
CCTGACAGAA 
CCTGGATGCT 
C3VTTGGCEGG 
CXATGAGGCT 
CSCACCTGGAC 
CACOCTGGTG 
CTCCCATAGC 
GCAGGAGGAC 
CAACCXGGAC 
GATGTTCCAd 

ccaoGTGcns 



41 
1 

GBCC3CTOGGA 
OGOGCGQAGG 
AGGCIGGOOG 

cracrrocTQQ 
AAGOCOCCAG 
AOAGAQGAGA 
TSOCAGGBGO 
ATIGACACCA 
OCCOGTOAGG 
CTiXai'AiCATCC 



51 
\ 



GAGCGA6T0C 
GGAGAGGGAQ 
CJUSCCTHQAT 
TGAASAfsClG 



AGAGCATOGT 
CGCTGOGGCG 
AQC3GGCATTT 
TCATOGACTT 



GGCAA6TTTG 
GAGCCdGGCZ 
GATGTGGATG 
CCTGAGGGCG 
OGGC3GQGAlC3L 
GATGGOGOCA 
ACCftCGGGCA 



!lACTAOGAGA 
TCGtCCAACA 
ATCCGOGCCC 
AAGACGAGGA 
CfOOGGGOCC 



rrGGACAAAG7 
CCAACAGTGA 
AGOrrOCGGAA 
GCTTCQATGC 
GCACCCACCT 
ACX3TGCTOGC 
CCTACAOCCA 
CJCftAGCACAA 
AGCTTCACAC 
TGBTGGAGC? 
TAGACaSGCC 
CTGGGTCCTT 
TltAOCftOGT 



€0 

120 

100 

240 

300 

3 SO 

420 

480 

540 

600 

€€0 

730 

7ao 

B40 

900 

960 

1020 

1080 

1140 

1200 

12fi0 

1320 

13 BO 

1440 

1500 

1560 

1620 

1680 

1*740 

1800 

1860 

1920 

1980 

204Q 

2100 

2160 

2220 

22 BO 

2334 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

7 BO 

840 

90Q 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 
GGATGGGAOQ CTIGGTGTQCC AGCTGGCOQA. GGAOCAGAAG GGCAACATCC ATCTGAAACC 1500 

TTOCTTCXCC GAOGGGCTCA. AGATGGACGC GGGC31TCRTC TGTGAXOTCIT GCACCTQCGA 1560 

GCTGCAAAAA GAGGTGCX3GT CftGCTTCGCTS CAGCTTCRAC GGAGACTTOG TGTGCGGACA 1620 

GTGTGTffTGC AGCGAGQGCT GGAQTCQCCR GACCTGCAAC TGCTCCACCG GCTCTCTOAG 1680 

J TGACATTCAG CCCTGCCTGC OG6?kGC3GCGA G6ACAAGCCG T<3CrC0GiGCC GTOGGSAGlG 1740 

CCAGTGCX3G6 CACTGTGTGT GCIAOC3QCGA AGGCCGCTAC GAGGGTCAGT TCTGCGAGTA 1800 

TGACAACTTC CAOTGTCCCC GCftCTTCOGQ GTrCCICTGC AAIGACCGAG GACGCTGCIC 1860 

CATG0G<2CRG TGTGTGTGTG AGCCTGGTTG GACAGGCCCA AGCTGTG2\CT GTCCCCTCAG 1920 

CAATCCCACC TCCATCGACA GCAATGGGGG CATCTGTAAT OOACJGTGGCC ACTtyrOAGTG 1980 

TGGOCGCTGC CACTGCCATC aBCftOTCGCX CTACACGG7VC T^OCftXClGOG iiGATCAACrA 2040 

CTCX3QCSGATC CACCXXiGGCC TCTGaSAGGA OCTAOGCTCC TGCGTGCAGT GCCAGGOGTG 2100 

GGGCAC066C GAGAAGAAG6 OGCGCACGTG TGAGGAATGC AACTTCAAGG TCAAGATGGT 2160 

GGACOAGCTT AAGAGAGCC6 AG6AGGTGGT GGTGOGCTGC TCCTTC3CGGG AOGAGGATGA 2220 

CGACTGCACC TACAGCTACA CCATGQAAGG TGACX3GCGCC CCTGGGOCCA ACAGCACTOT 2280 

CCXQGTGCAC AAGAAGAAG6 ACTGCCCTCC GGGCTCCTTC TGGTCJGCrCA TCOOCCTGCT 2340 

QCTCCTCCTC CTGCCGCTCC TGGCCCTGCT ACXGCTGCTA TGCTGGAAGT ACTOTGCCTG 2400 

CTGCAAGGCC TGCCTGGCAC TTCTCCOGTG CTGCAACCQA GGTCACATGQ TGGGCTTTAA 2460 

GGAAGACCAC TACATGCTGC GOGAGSUVCCX GATGGCCTCT GACCACTTGG ACACGCOCAl" 2520 

GCTGCGCAGC QGGAACCTCA AGGGCCGTGA CGTGGTCCXSC TGGAAQGTCA CCAACAACAT 2580 

GCAGCGGCCT OGCTTTGCCA CTCA,TGCGGC CAGCATCAAC COCSU2AGAGC TGGTGCCCIA 2640 

OSGGCrrOTGC TT6GGCCTG6 CCGGCCTTT6 CACOGAGAAC CTGCTQAAOC CTGACACT06 2700 

GGAGTGCG<3C CAGCIGCGCC AGOAGGTGGA GGHGAACCXG AAOSAiOGTCT ACA06CAGAT 2760 

CTCCX3GTGTA CACAAGCTCC AGCAGACC3VA 6TTG0GGCA3 CAGCCCAATQ CC]GQC3AAAAA 2820 

GCAAGACCAC ACCATTGTGG ACACAGTGCT GATGGOGCCC CGCICXSGCCA AGCGGGGCCT 2880 

GCTGAAGCXT ACnaAGAAaC AOGTGGAACA GAGOGCCTTC CACGACCTCA AtSGTOGCCCC 2940 

COGCTACTAC ACOCTCACTG CW3ACCAGGA CGOCCGGGGC ATGGTGGA6T TCCAGGAdCSGG 3000 

CGTGGAGCT6 GTOGACGIAC GGQTQCCCCT CTTTATCCX»3 CCIGAGGAOXS AClAOGAGAA 3060 

GCflGCTGCTQ GTQGAGGCCA TCGACGTGOC CGCAGGCACT QCCACCCTCQ GCCGCCGCCT 3120 

GGTAAACATC ACCATCATCA AGGAGCAAGC CAGAGAOSTG GTGTOCTTTG AGCAGCCTGA 3180 

GTTCTCQGTC AGCaSCOGGG ACX^AGGTGGC GOaCATCSCCT GTCATCOGGC OTGrCCTOQA 3240 

CGGOQGGAAG TCCXAGGTCT CCEACOGCAC ACAjGGAXGGC ACX^GOGCAGG GGAACOGQGA 3300 

CTACATCCCC GTGGAGGGTO AGCTGCTGn CCnGCCTGGG GAGGCCIGGA AAGAOCTGCA 3360 

QGTGAAGCTC CTQGAGCIGC AASAAGTTGA CTCCCTCCTG OQGGGOOGCC AGGTCCGCCG 3420 

TTTCCACGTC CAOCTCAGCA ACCCTAAGTT TOC3GQCCCAC CTGGGCCAGC OCCACTCCAC 3480 

CACCATCATC ATCAGGGACC CAGATGAACT GGACCGGAGC TTCACGAGTC AGAT6IIGTC 3540 

ATC3kCAGCCA OCCCX:TCAGa GGQACCiaGQ GGCCXXSCAjS AACGCCAA30 CTAAGGGCXIC 3600 

TGGGTOCAGG AAGRTOC&TT TCAACTGGCT GOCOOCTTCT GGCAABOCAA TGGGGTACAG 3660 

GGTAAAGTAC TGGATTCAGG GQGACTCCX3A ATCGGAAGCC CACXrTGCTOG ACASCAJ^QGT 3720 

aCC3CTCAGTG GAGCTCACCA ACCTGTAOCC GTATTGOGAC TATGAGATGA AGGTGTGCGC 3780 

CTAOGGGGCrC CAGGOC^GAOG OACCCTACAa CTCCCIGGTG TCCIGCCGCA CCCACCAGQA 3840 

ASTSCCCAGC GAGOCftGGGC 6ICT6GCCIT CAAT6T0GTC TCCTCCAOGG TGACOCAGCI 3900 

GA6CT6GGCT GAGCSCGGCrCG AOAiCCAAOSS TGAlOATCACA GCCIAC3BAGG TCTBCTATGO 3960 

CCTOOTCAAC GATGACAACC GACCTATTGG GCCCATGAAG AAAGTGCTGG TTGACAACCC 4020 

TAAGAACC3GG ATGCTQCTTA TTGAGAACCTT TOQGQAOTCC CAGCOCTACC GCTACAGGGT 4080 

GAAGGOGOGC AACGGGGCGG GCIOGSGGCC T6AGCX3GGAG OGCATCATGA ACCTGGCCAC 4140 

CCAGCCCAAfi AOaCCCATGT CCATCCOGKT CATOCXTTQAC ATCCCXATQ8 TGGACOCQCA 4200 

GASCtSQSGAG GACTAOGACA GCnCCTCKl GTACAGOGAT GAGGTTCTAC GCTCTGCATC 4260 

GGGCAGOCAG l^GGCCCAGCT TCTOCOATGA CACTQOCTGC QOCTGOAAGT TfJOAGCCOOT 4320 

GCTOGOGQAG GAGCTGGAOC TGCGGCGCGT CAOGTQGOGG CTGCCOCOGG AGCXCATCOC 4380 

GOGCCTGTCG GCCAGCAGCG GGCX3CTCCIC COACGCDGAG GCCCSCCAiGGG CCOCXSCGGAC 4440 

QACGGCGGCG CGGOGGGGAA GGGCGGCAGC 06TG00C06C AGTGOGACAC GC33GGOCGOC 4500 

CGGAGAGCAC CIGGIGAAXG GGCXSGAXGGA CITTGCCTTC OOGGGGAGCA CCAACIGCCT 4560 

GCACAGGATG ACCACGACCA GTGCTGCTGC CTATGGC^^CC CACCTGAQCC CACACGTGCC 4620 

CCACCGCGTG CTAAGCA£AT CCTCC3WXCT CaCAGGGGAC TACRACTCAC TGAOCCGCTC 4680 

AGAACACXCA CACIGGACCA CACTGCCC!AG GGACTACTCC JkCCCTCRCXTS CCGTCTCCEC 4740 

ccauxncTcr cgcctgactg ctggtgxgcc ogacaggccx: aoo ogo cigg tgttgtctgc 4800 

eX»GGGGCCC ACATCTCTCA 6A6TGAGCTG GCAGOASGGG OQGTOCQAGC GGCQGCTGCA 4860 

GGGCTACAGT GTGGAGTACC AGCTGCXGAA GGGOGGTGAG CTGCATCGGC TCAACAXGOC 4920 

CAACCCTGCC CAQACCTCGa TGGTGGTGQA AGACCTCCTG CCCAA0C3WCT CCfEACGTGTT 4980 

CCGGGTaGGQ GCCX»6AGGC AGGAAGGCTG GGGCOGAIGAO OGTGABGGTG TCATCAOCAT 5040 

TGAATCXSCnG GXGCACCCGC AGAGCCCACX GtCTGOOCXG GCAOGCTCOG OCrcCACrET 5100 

GavSCACrCCC AOTGCCCCMS GCCCGCTGGT OTTCACTGOC CTGABCCXnG ACTCSCTGCA 5160 

GCTGAGCTGG GAGOSGCCAC GGAGGOCGAA TGGGGATATC GTOGGCTACC T6GTGAOCTG 5220 

TGAGATOGCC OU^GOAGQAO GGCCAGCCAC CGC::aTTCCG6 GTGGATGGAG ACAGCCCC3GA 5280 

GAGCCGGCTQ ACOGTGOOGG GOCTCM3C3GA GAA06T6CCC TACAAGTTC3V. AGGTGCAGGC 5340 

CAGGACCACT GAggSCTTOQ GGCXAfiftGOS C3SAGGGCATC AXCAGGKIAG AClOCCACGA 5400 

TOGftOGACCC TTCCOGCnSC TGOGCAGCOG TtSCOGGGCTC TTCCAGCIIOC OGCTG^MAG 5460 

OGAGTACAGC AGCATCACCSl CCACCC3^CAC CAGCGCCACC GAGGOCTXOC TAGTGGATGG 5520 

GCTGACCCTG GOGGCCCAGC ACCTQQAGaC AQGCSOtJCTCC CXCROCCBGC ATGTGAGCCA 5580 

OGAGTTTGTG AGCOaSACAC TGAGCACCA6 CX3QAACGCIT AGCACXrCACA TOGACCAACA 5640 

GTTCTTCXnA. ACTTGAOOGX: AGCCTGC0C3C AOOOCGGCCA TGTCGChCXA GGOGTCCICC 5700 

OGACTGCTCT CCOSGABCCT CCTCAOCXAC TCCATCCTTG C3U2CGCrGGS GGCCCAGCCX: 5760 

AOXGCATGC ACAGA6CAGG GGCTAGGTGT CTCCTGGGAG GCATGAAGGG GQCAAGGTOC 5820 

GTCGTCTGTG GGCCCAAACC TATTTGTAAC CAAAQAGCTG GGAGCAGCAC AAGGACCCAG 5880 

CCXTlGTTCr GCACTTAATA AATGGTTTTG CTACXGCTAA AAAAAAAAAA AAAAAAAAAA 5940 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 5994 



Seq ID NO: C92 DNA Sequence 
Nucleic Acid Accession #: NM_023915 
Coding sequence: 250..1326 

1 11 21 31 41 51 

| | | | | | 

OGGAaSAGGO TTTCGTTTTC ATGCITTACC AGAAAATCCA CTTCCCTGCC GAjOCTTAGTT 60 
TCftAAGCTTA XTCTEAATTA GAGACAAGAA ACCTaXTTCA ACTTGAAGAC ACC6TATG71G 120 

GTGAATGGAC AGC3CRGCCAC CACAATGAAA GAAATCMAC C7VGGAATAAC CTATGCTGAA 180 

CCCACGCCTC AATCGTCCCC AAQTGTTTCC TGACACGCAT CTTTGCXTAC AJCSTQCATCAC 240 

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCSAAATTAC CAAATAACGA GCTGCAiCGGC 300 

CAAGAGAGTC ACAATTCAGG CAACAGGAGC GACGGGCX2A6 GAAAGAACAC CACCCTTCAC 360 

AATGAATTTQ ACACA&TTGT CTTGCOGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420 

1:TGCTGAAT6 GTTTAGCAGT GTGGATCTTC TTCCaCATTA GGAATAAAAC CAGCTTCATA 480 

TTCTATCTCA AAAACATAiGT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTC3GA 540 

ATAGTCCATG ATGCAGQATT TGGACCTTGG TACTTCAAQT TTATTCTCTG CAGATACACT 600 

XCA3TTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCXTTTGOGCT GATAAGCATT 660 

OATGGCTATC TOAACaaTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720 

ACBAAGGTTT TATCICTTTG TGTTTGGGia ATCATGGCTQ TTTTQTCTTT OCCAAACATC 780 

ATCCTGACAA ATGQTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 

CCTTTGGGGG TCAAAOTOGCA TAjCGGCRGTC ACCTATaTQA ACAGCTGCTT GTTTGTQGC3C 900 

GT6CIGGTGA ITCTQATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCCAGC 960 

A06CAATTCA TAAGXQ^TC AAGCCGAAAG COAAAACATA ACCAGAQCAT C3«3GGTTGTT 1020 

GTGGCIGTer TTTTTAOCTG CTTTCTACXyV TATCACTTGT GCAGAATTOC TTTTACTTIT 1080 

AGTCACTTAt3 ACAGGCTTTT AGATGAATCT GCaCAAAAAA TCCTATATTA CTOCAAAGAA 1140 

ATTACAcrrr tcitqtctgc qtgtaatgtt tgcctggatc caataattta crxTTTCaTQ 1200 

TGTAGGTCAT TnCAAGAAG GCTGITCAAA AAATGAAATA TCAGAACCAG GAGTGAAAGC 1260 

ATCAGATCAC TGCAAAGTGT GAGAAGATCG GAAGTTCGCA TATATIAIGA TTACACTQAT 1320 

STQTAGaC3CT TTTATT6TTT GTTGCSAATCG ATATGTACAA AGTOTAAATA AATGTTTCTT 1380 

TTCATTATCC l-TAAAAAAAA AA 1402 

Seq ID NO: C93 DNA Sequence 

Nucleic Acid Accession #: NM_020789.1 
Coding sequence: 208..3699 

1 11 21 31 41 51 

| | | | | | 

GGCAGQAGGQ TGGA6006AG C66TGCG6AG CAiSATCrGGT GGTrCTCCGG AOAGCAGCTT 60 

CCTTOGGTGT TACATGAGCXT AAGOOCTCAC TGTACAGAAG AGTGAGAGCT GAAAOCOKSTT 120 

CCCTGAGCTO ATCAGAAGGA CATCCCTTGG OOCCTCCATC TGQGCTCCTG TQGATAGGAG 180 

GGGCTGGGTG AGCAGGCCAG CTQQGCTATG GTGTGGTGCC TQOGCCXGGC OGTCCTCRGC 240 

CTGGTCATCA GCCAGGGGGC TGftOGGTCGA GGGAAGCCTG AGGTGGTATC GGTGGTGGGC 300 

CGGGCTGAGG AfiAGTG^TGGT GCl6GaCIDT GACCTGCTGC CCCCGGCXX3G GCGOCCCCCC 360 

CT6CAIGTCA TCGAGTGGCT GCX3CTTTG6A TTOCTGCTTC CCATCTTCSkT CX3WBTTOGGC 420 

CTCTACTCTC 0CG6AATTGA OCCTGATTAC GTGGQACGAG TCX3QGCTGCA GAAGOGGGCC 480 

TCTCXCCRGA TTGAGGGTCT CCGGGTGGAA GACCACGGCT GGTAC3GAGTG CCSOOGTGTTC 540 

TTCCTGGACC AGCACAT^TC TGAAGAOaAT ITTGCTAAOG GCTCCTGGGT GCAICTGACA 600 

GTCAAXTCAC CCXXrTC AAT T OCAGGAGAGh GCTCCTGCTG TGTTGGAAGT GCAQGAACTG 660 

GAI3CCTGTGA CGCTGOtSTTG TGTGGCXIOBT GGCMCCSCCC TGCCTCATGT GAOGtTGGAAG 720 

CrCCXSAGGAA AaGACCTTGG CCAGGGCCftfi GGCCAGGTGC AAGTGCASAA CaOGACXSCTG 780 

CQOATOCGCC GGGTAGAGCG AGGCAQCTCT GGGQTCTACA CCTGCCAAGC CTCCAGCACr 840 

GAGGGCAGOS CCACCCAC6C CACCCAGCIG CTAGTGCTAS GACCCXXShST CAT06TGGTG 900 

CCX3CCCAAGA AC3hGCAC3lGrr CAATQCCTCC CAGGATGTTT GATTGGGCTQ CCATGCTGAS 960 

GCATACCCTG CEAACCTCAC CTACAGCTGG TIC9CA0GACA ACATOVATGT CTTCCACATT 1020 

AGCX3GCCTGC AGCCOCGG^ GCAaAT(3CTG GTGGAiCGGQA GCXTTGOGGCT GCTGGCCACC 1080 

CRGOCTGATG ATGCCSGGCTG CTACACCTGr GTGCOCAGCA ATGGCXTGCT GCATCCA£X2C 1140 

TCAQCCTCXG CCTACCTCAC TGTGCTCIGC ATGCaaGGGQ TGATOCGCTG CCCGGTTCGT 1200 

QCCMCOCCC CACIGCTCTT TGTCAGCIG6 ACCAAGGftTG GUtfUSGOCXTT GC^AGCTGGAC 1260 

AAfiZTCCCTG GCTGGTCCCA GGGCAO^GAA GOCTCACIGA TCATCGCCSCT GGGGAAOGAG IB 20 

GATGC5CCTGG GAGAATACTC CTGCACCCXK TACWWaGTC TTGGTACOGC OGGGCCCTCT 1380 

CCTGTGACXX: GCXJTGCTGCT CAAGGCTOGC OCRGCTTTTA lASAQaaoCx: C3UM3GAAGAA 1440 

ZATTTCCAAG AAGXAGGGCG GGAOCTGCTC ATCCCCTGCX OCGOCCAAGG GGACOCICCr 1500 

CCTGTirGTCT CIIGQACCAA GGTGGG006G GGGCXGCMUS GCCAGGCOCSA GG13GGACAGC 1560 

AACAGCnSCC TCHaXSCrOCa ACCATTGACC AAGG«3G0CC AOGG6CACTG GGAATGCAGT 1620 

GCCAGCAATG CTGTGGOCXX3 AGTGGCCACX: TCCACGAAOS XCIACGTGCT GGGCACTAGC 1680 

CCTCaTCTTG TCAOCAAXGT GTOCaTGGTQ GCTTTQCCX^A AGGGTGCCAA TGTCTCCTGG 1740 

GRGOCTGGCT TTGATGOTGa TTATCTGCA© AGATTCRGTG TCTGSTACAC CGCACTGGCC 1800 

AAGCBTCCTQ ACOGAATGCA OCATGAGIG6 GOTGrrCClTGG CSU3TQCCI6T GSBOGCTCCT 1860 

CACCTCCIAS TSOCAOGGCT GCAOCX3CC3U;! ACCCAGTACC AQTTCAGOGT GCTAGCICAG 1920 

AACAAGCTGQ OGAarGGTOC CTTCAGOGAA ATOGTCTIGT CTGCTCXSSGA AGGGCTTCCT 1980 

ACCAOSCCAG CTGCACCCG6 6CTTCCCCCA ACAGAGATAC OGCCTCXZCCr QTCOCXTTCCG 2040 

CGGGGTCTGG XGGCAOTGAG GACACOCGGG OGGGTACTCC TGCATTGGGA TCCOCCAG2U5 2100 

CIGQTGCCXA AGA6ACTGGA XGGCXAOSaCC TIGGAAGGCC GGCSUkSGCIC GCAGGGCTOG 2160 

GAGGTGCTGG ACCaaOCTGT OGCAOQCACA GAAACAOAQC TQCIGGTGCC AGGCCTCATC 2220 

AAGGATOTTC TCTAOGAGTT CCGCCTOGTG GCCTTCGOGG GCAGCrTOGT CAGGGAEC3C3C 2280 

JM3CAACA0SG CCftACGTCTC CACTTCCaOT CTGOAGGTCr ACCX:TTC!GCQ CACGCAGCTG 2340 

COGGQCCTCC TGCXTTCAGCC OSTGCIGGCC GGCGTGGTGG GOGGAGTCXG CXTXCTGGGA 2400

GTOGCXXvXOC IWrCAGCAT GCTGGCOQSC TBC C TOCTGA. ACGGGOOC9U3 GGCTGCCGGC 2460 

CGCC3SCCGCA AQ08CCXCCG CCAAGAXOCA CX:TCTTATCT TCTCTGCX3AC OSGGAAOTCA 2520 

GCTTGCACCCT CTGCXCTGGG CTCAGGOlGX CCTGACAGCG TGGCGAAGCT QAAGCTOCAG 2580 

GGATCCCCAG TCOCCftGCCT GCGCCAGAGT CTGCTCTGGG GGGATCCTGC CGGAACTCCC 2640

AGOCCCCACC OQGAXCCXCC ATCTAGCGGG GGAOCCXTAC CICTGGAGCC CAXTTGCOOG 2700

GGOOCAOAOG GGOGCTTTGT CATGGGGGOC ACTSXaaCGQ CCXSCOCAGOA AAGGXCAGGC 2760 

tX3GiaR£3C3U36 CAlSAACCrCG GACXCOU30C CIVGOGTCTGG CCCCSGXOCTT TGACXGTAGC 2820 

AGCAGCAGCC OTWJTGGOOC ACXXCTGCCC CICTGCATTG AAGACATCAG CCCTGTGGCA 2880 

CCCOCrCCAG CAGCCCCAGC CAGTOCClTG OGAOaTaCTQ GACCSDCTQCT CCAGTACCTTG 2940 

AGCCTGCCCT TCXTCC3GAGA GATGAATGTG GATGGOGACT GGCC3COC3GCr TGAGGAGCCC 3000
AGCGCIGCXG CACCCCCAOA TTACATGGAT ACCOOGOGCT GTCCCACCTC ATCrTTCCTT 3060 
CGTTCTCX3U3 AAACCCCTCC TfiXATCCCCC AGGGAATCAC TXCCTGGK36C TGXOGXAGGG 3120 
aCTGGGGOCA CTGCBOAGCC CCCXTACACA GC0CTG6CTG ACIGGACACT GAGGGAGGOa 3180 
CXGCTGCCAG GCCTTCTCCC TOCT8CCCCT CXSAGGCftSOC TCACCAGCCA GAGCAGCGGG 3240 



OGAGGCAGCQ 
CTCAGCC3CTG 
AGGGAiGCATQ 
TGOGACTCAG 
AGCICCCGGC 
TGOCTCCTGA 
BAATTCCTGG 
CAGCCAC5TCC 
GAAAAGOCAT 
GGCACTGCCC 
CUCCTTGGTA 
TCTTAG6TCT 
GTOTGTGAAG 



CTTOQTTCCT 
CTCCAGGAGA 
TGGTGRCAGT 
AATTCCCTGG 
TCAGACCTOa 
ACACTGCCCA 
CCTTCCGCOB 
CCCACCCCGA 
ATGGACCTOC 
CTGCCTGCCC 
rrGGAATBTAT 
GOCTCTCTCT 
Tl'i"iTTACAi6 



OCOX3CCCCCC 
CACCAGCAQC 
CAQCAAGAGG 
GGACATGQAA 
AI3CIGA6ACA. 
TGTTACTQQC 
COGCCGAGAT 
ACRGGCCACT 
AAAGGA06CC 
CTTTGGTGCC 
GTGCIGAjCC3C 
CTCTCTCTCr 
GTQAATAAAC 



TCCACIbCSOCC 
TGQGOCA6TG 
AGGAACACAT 
TTGCTGGAGA 
GAGCTAGGTQ 
CCTGAGGCCC 
GCTACTAGGQ 
CTGCTGTGAA 
CCCAACCA6A 
CAGGCACAGA 
CCTAOGTBAG 
CTCTCTCTCT 
AAAQTTTOAA 



Seq ID NO: C94 DNA Sequence
Nucleic Acid Accession #: NM_006875
Coding sequence: 186..1X90



1 
I 

GAATTOOGCA 
CACCAQTTTC 
CCCGGGOGTC 
CCTCCATOTT 
CGCCAIQGAGG 
GTAAGGGGGG 
CCATCAAAGT 
CATGCCCACr 
TGATCGGGCr 
CTTTGCCCGC 
CAAGCSCGCTG 
TTGTCCMCG 
AACTCATTGA 
OC^ACAAOOGX 
CCACTSTCTQ 
AQAGGGAGC& 
GCTGTGCX:CT 
AGATCCTGCT 
AAAGGAQGGC 



G'STGRCTCBS 
ATTCAGGATC 
GAAAGGAGCC 
CTCATTXIGC 
TTATTTTOAT 
TCATATGCl'T 
AGXAAAGOQA 
TCftOCCCAGG 
TTITITTTTT 
CTGOTQAGAA 
CAOCACCAGA. 
GCTTQCTQTT 
CTGAGCOGGG 
TCCAAfiTGTG 
CCACXAHTTTA 



11 
I 

TCT6CTTTCC 
CACGCCCTQC 
GACCAAGCCT 
CAAGGATCGG 
CTTTOGCACC 
GATT0CXXX30 
CGAAGTCQCA 
GCTTGACTGG 
CCAGGATCTC 
CTTCTTTOGC 
TGACATCAAG 
TTTTGGTTCT 
GTACAGCCCC 
GTCACTGGGC 
GGAGATTCTG 
AATCCGOCQG 
GGACCCCXX3Q 
CTGCCCCTTT 

TGGTcrurauus 

TTTTACAQGT 
ACrOGGTTAGA 
TTCCTCCCAG 
TAAGOAASTT 
GATGTGXCAC 
TTACTTQQGIC 

ATTTTTTATT 
TTTTTTTTTG 

gaaccitaat 
caataggatg 

TGTTTTCCTG 
ATTQTCCAAT 
CCCTCCTTTT 
ATAAASarAA 



21 
1 

GCGAATCTCA 
ACCCTGGCX3C 
GGQCTTAGCG 
CTACAGOGGC 
GAM3CQTTCG 
GTCTTGGCA6 
AATCGTGTGC 
CTGCTATGGA 
TTTGAGACAC 
TTTGIftCTATA 
CAAGTA6TGG 
GATGAGAACA 
GG*EGCCCT<5C 
CCAOAGTGGA 
ATOCTCCICX 
GAAGCTOAaC 
TQOCTGGCSrC 
AT6CAAACAC 
GGCCTGGTCC 
AQCCATCCCA 
CATTACCAGT 
AGACATAAAC 
AACCTGTGGT 
TATTXTQGTG 
CXXRCKTIGQ 
AAGGGTGCTT 
TAGCCTAGGG 
TTGGGGGAGQ 
GGT6AGGGGA 
TCCATAATTT 
GGATGGATGG 
GGQCGCTCCC 
TACTAAAATG 
TTTTCCTGCC 
TAQAATCftQA 



31 
I 

ACGCTGCGCC 

CXX:CCAGCCC 
GQTTCAOTGG 
CTCCCGOGCC 



GACAC06CCT 
TGGGCTGGTC 
AAGTGGGTGC 
AGGAAOGCrr 
TCAOUSAQAA 
CAGOCKTGCA 
TCCTGATAGA 
TTCA1X5ATGA 
TCTCTGOACA 
ATGhCATGGOr 
TC3CACTTCCC 
CCAAACCTTC 
CAQCCGAGQA 
TTGCTAOCCT 
TQGC^TGTC 
CATTAAAGTC 
CAAGTTTGCC 
CCCTGA'lT'ri* 
AAG7T6TTCC 
CACCrCCTAC 
TOCTTOCAAT 
TCCCATATTG 
TAATQCXCTG 
CCCEACTTTG 



^CCAATTTTQ 
TAAATAATCA 
TGGA^IAIIT 
AAAAAAAAAA 



CCTCTGCAGG 
GCCCT6AGAO 
CTGTGGACGA 
CTTTGCA.CCT 
TGAAGACTCC 



CTCGGCTACC 
CATCCCTAAT 
CAQACCTAGT 
CCCTGATAGX 
TC7GGGGATT 
CTCTCTCT6I 
AGAAAAAAAA 



41 
I 

6TCTGCGGGC 
7GGCTCX:CCZA 
GCTCAATCTG 
CCCCGGGACC 
TCGACTOOGC 
CACAGATCGA 
CCCCTTQTCA 
AGGTGGTGGG 
QyTGCTGOTC 



GCACTGCCAT 
CCTACGCOGT 
ACXSTTACACT 
OCAGTACCAT 
GnSTGGGGIAC 
AGGCCATGTC 
TTCCOGACCC 
TGTTAOCCCT 
AAGCCTGGCC 
ACAGGGATAG 
QUSTATIACT 
CAGTTCOCTT 
GGAGGOGGAA 
CATTTTGAGC 
TACCACCACA 
AOOCCAGXAG 
GGTCAAGCTG 
TTGTTACCCC 
TTATGOQUUS 
GGAAGATGGA. 
GATGOQCTAG 
CAGAT7TTTG 
GGTATTGTGG 
AAAAAGCCAT 
AAAAAAAA 



AGGCAGCTAC 
ATGGCCCOGA 
GAACTATGAO 
GGGCTTGGCC 
AGAGOAGGGC 
fX!TTGGG(SAG 
A6CCTATOGA 
GTGAGGCTGT 
TTCAAAC6AG 
OGGTTTGGGT 
GGAAGAGGGA 
GTGTOTGTGT 
AAAAAAAAAA 



51 
I 

GCTrGCXJOGC 
GCTGCGCTGC 
CGCAGCGCCA 
CSrCAOGCCGC 
GOCCfOCTGG 
CrCCAGGTGG 
GACTCAGTCA 
CACCCTGOCQ 
CTCGAGCG6C 
GGTGAAGGCC 
TGCGOrGGAG 
GGCTGTGOCA 
(5ACTTTGATG 
GCACICCCGG 
ATTOCCTTTG 
TCGOCAGnCT 
TCACTGGAAG 

CAAOCXxnrcc 

TGGCCTGGCC 
ATGGACATTT 
AAGGTAAGG6 
COCAATCCXA 
CTTCTTGCTT 
CCCGOQAGIC 
OlAACTTAGT 
CTTTTATTTT 
CWACCTGOC 
AAGGCTTCIT 

TQLrixrri'ATT 



Seq ID NO: C95 DNA Sequence

Nucleic Acid Accession #: NM_002510.1

Coding sequence: 92..1774



1 
I 

CAGATGOCAG 
CXrTTQAGTGC 

CAATGAAAGA 
TGBWUVTSAC 
AAACTC3CTGG 
GGGCTCAAAT 
CAATG6GAAC 
ATAT6ITTAC 
AAGGCATCAT 
ATGGAATTTC 
TrCAGTGAGA 
GACTGTCTAC 
COTGGTAACA 
ATCOGACOAA 
TAGCCACTTC 
CCTGTTTGTT 
CCTTAACCTC 
CAGACCTICA 
TAGGATTCCr 
AATTQTAGAG 



11 
I 

AAGAACACTG 
CTGCGTCCGT 
GCXGCAAOAT 
CCTTCTGCrt 
TGGAATGAAA 
AAGGGAGGCC 
ATAACA'i"lT6 
ASCAGTCTKXQ 
AACTOGACSVQ 
AAC3GTCTTCC 
ATCTACGTCTF 
6TTTCTGTGA. 
AGAAGACATG 
GRTCAGATTC 
AOCTTOCTCR 
CTCAATTATT 
TCCACCAATC 
ACTGTGAAAQ 
AAACCCACCC 
GATOAAAACT 
GGAATCTTAG 



21 

! 

TTGCTCTTGG 
GAGAATTX»G 
TGCCACTTGA 



AACTCTAOCC 
GTQTGC3\GGC 
OGGTGAACCr 
AQAAGAACra 
CATGG'fCRGA 
CTGATGGGAA 
TCCAjCAC&CT 
ACACAGGCAA 
GA0GS6CAXA 

ctqtqt^tot 

AAGATCTCCC 
CTACCaVTTAA 
ATACTGTGAA 
CTQCA6CAOC 
CTTCITTAGG 
GCCAGATTAA 
AGGTTAACAT 



Bl 
1 

tCGGAOGGGCC 
OVTGGAATOT 
TGOCGCCAAA 
OCACAATCftA 
AGIGTGGAAG 
GQTCCTGACC 
6ATATTCGCT 
CAGAAATGAG 
GGACA6VGAC 
ACCTTTTCCT 
TG6TCRGTAT 
TGTGACACTT 
TGTTOCCA1X2 
GACTATGTTC 
CATTATGT XT 
CTACAAGTOQ 
TCACAOSTAT 
AGGACCXTGT 
AOCTGCTGGT 
CAGATATGGC 
CATCCAGATG 



41 
I 

CA6AG6AATT 
CTCiaCTATT 
OGATTTCATG 
TTAAAlVjOCX 
OSOGGAGACA 
AGTGACTCAC 
AGAOXacCAAA 
GCTGGTTTAr 
OGGGAAAATG 
CAOCAOCCOG 
TTCC3WSAAAT 
GGGCCTCAAC 
GCACAAGTOA 
CAGAAGAACG 
GATGTCCIGA 
AGCTTOGGGG 
GIGCTCAATG 



GOGAAATAAG 
CRACXTTCCTC 
GGAGGGOAGT 
OTGTGGAAAC 



51 
I 

CASAGTTAAA 
TCCTGGGA3T 
ATOTGCTGGG 
OGTCITCIGA 
TQAGGrGGAA 
CAGCGCrCGT 
AGGAAGAT6C 
CTGCiaATCC 
GCACCGGCCA 
6ATGGAGAAG 
TGC3GACGATG 
TCATGGAAGT 
AAGAT8TGTA 
ATC9GAAATTC 
TTCATGATCX: 
ATAATACTGG 
GAACCTTCAG 
CACCAOCACC 



3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3640 
BdOO 
39€0 
4020 
4024 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

B40 

BOO 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

16 BO 

1740 

IBOO 

1860 

1920 

1980 

2040 

2036 



GACAACCGOC 
CACTTTCRAG 

jkCMSRCsarcc 



TGATGCOGGT 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

€60 

720 

780 

840 

900 

960 

1030 

loao 

1140 
1200 
1260 



CCCAT06CCT 
OGAGGTCIGT 
CCGTGTGGAlr 
GAOGXACTGT 
GATTTCT6TT 
CrCCfiTTGQC 
CAAGGAATAC 
TGTCTTTCTC 
ACTCAAAJyVC 
TTTTCAGTCC 

TTTAOftQATG 
AAAGCAACTT 
TAACTAGTAG 
AGOCTAACCC 
ATGC3kTAAAG 
TTCAATACAC 

gtgcrcactt 

TGACAACCTA 
GGACA.TTTAG 

AQATOAGQTC 
AGTAAAhCCA 
AAC^rQTGGOA 



GAAAGCTCC3C 
AOCATCftTTT 
CTC3f3ATGAGA 
GTGAACCTCA 
CCTQACAfaAG 
TGClfTGGCCA 
AACCCAATAS 
AAOOGTGCAA 
CAAGAATTTA 
CATXGATGTG 
ATAGATATTG 
GGGASAGGGA 
AGCAAGGCTT 
GATAGAAACA 
AGGTTAACTS 
CCKAT6TAGT 
ACTCATGAAC 
GCTAGACrCA 
CTTTGCTTGQ 
TTAGTCCm 
TTTTQTATAG 
OCTGGTTTTT 
TCTACTATAT 
AGAOACAAAA 



TAATAGACTT 
CTGACCCCAC 

TGTGTCTGCT 

ACCXIAGCCTC 
TATTTGTCAC 
AAAATAGTCC 
AASCCGTGTT 
AAGGAGTTTC 
AGAT6TGCT6 
TQGTTTGGaa 
TTATACTGCA 
CT TTTCATTA 
CTGTGTCCOG 
CAAOAAOAGG 
CCAGTTTCTA 
TCCTGATGGA 
GAAAAAATAC 
CTOAGTGAAO 
TTATATACCA 
TCGCiaCACA 
CATGGCAACT 
GTKAGACAT6 



T6TG6TGACC 
CTGOWGATC 
GACTGTGAGA 
TGACACAASC 
GCCT:irTAAO& 
TGT6ATCTCC 
TGGGAATGTG 
CTTOCCOGGA 
TTAAATTTCG 
GAGTGGCTAT 
AABTTGAATT 
OGCA6CIXCA 
TTTTTTATGT 
AGA6TAAGGA 
CGGQATACTT 
AQATCKT6TT 
ACAATAACA6 
TACTCTCATA 
GAATQATATT 
GGCATGATGC 
TATTTQAAAT 
TGATCAGTAA 
AjCATTCTTTT 



7GCCAASGGA 
ACCCAGAACa 

CGAACCTTCA 
CTGGCTCTCA 
AIGGCAAACA 
^TCTTGQTQT 

gtcaqaagca 
aacx:aggaaa 

ACXTTGTTTC 
TAACCTTTTT 
TnTATAiGQT 
GCCATGTTGT 
TTCACTTATA 
GAGAAGCTAC 
TCRGCTTTCC 
OCAAGCTAAC 
GOCCAAGCCr 
AATGQGTGGG 
CATATATTCA 
7GA6TGACAC 
CATATATTAA 
GGATTTCACC 
TCTCTCCXTC 



Seq ID NO: C96 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..4247



1 
I 

ATGCGAATOC 
TGC5GCTAATQ 
ACAGGAGCAC 
CAATCTCCTA 
AAATTTCAGG 
ACAGTOGAAA 
TTTAAAGCAA 
GAGCATA6TT 
QACCGATTTT 

GAAAOfTGTTA 
CTGCCAAACT 
ACAOACACAG 
GCTGTTTTTT 
TTACAAAACA 
ACZGGAAAGG 
GACCCAGAGA 
ACCATGATTG 
OATGAATTTT 

TTAATTGGAA 
GCTAT!rQTaA 
ATTTCTAOCA 
OGiKTOCCCAA 
AATTCCRCTT 
CAGACTGTSA 
OaCTCTAAAA 
AATACAGTTT 
ACIGGAOCTG 
OAOAACATAT 
GTOCTTATAC 
GAATCACTAA 
ACAGCACAGC 
AX&GGEGTTG 
CAGGCTCCCT 
ACTGAGGXAA 
GTCAAOGTGG 
OlTGASTCrC 
CrTGTGATO& 



GTTATATCCA 
ATAAAGCACT 
TTTGAGGAAG 
CCAGACAACA 
AAGCTAGCAC 
GTTtSAIGGCr 
□CTGAAQATT 
AACCTCGTGG 



AOSAATT7TA 
GGAiGSTGTOa 
TCOCTBCCAG 



11 
I 

TAAAGCGTTT 

GATACTACAG 

TGAATCAAAA 

TCAATATTQA 

6TTG0GATAA 

TTAATCTCAC 

GCAAGATAAC 

TAQAAGOAiCA 

CAAGTTTIGA 

AfaCTTOOGAC 

GTOGTTTTGG 

CAACTSACAA 

TTGACTGGAT 

GTQAAGTTCT 

ATmCGAGA 

AAGAGATTCA 

AITATACXIAG 

AGAAGTTTGC 

TGACAGATGG 

AIGTTCTTCA 

TGATTGTCGA 

CIGAAGAAAT 

ATCCTG3TAG 

CAACACACXA 

CAAOAGGAAG 

CGCRAOCSVGT 

CTOAACTOGC 

CTGTTCTTAG 

CXATAACAGA 

AAGATTCTTC 

GCCUGGGTA 

CAC3AATCTGC 

AGGATCCTTC 

COGATOTTGQ 

ATOAATCTGA 

CAlGTIACAGA 

CACCTCATGC 

TATACXCGCA 

GTATTGGTCX 

OXaTCAGCOCI 

AATGCTTCCA 

CACCTC3CAAC 

TTCCAAAGCA 

TGCAOAGCTG 

AGCACAAGAA 

AGCTTGCTGA 

ACAACAGACC 

TC:rGGAaAAT 

AGAAAGGAAG 

ACTTTCreGT 

CTCTAAGIAAA 

TCACACAGTA 

TOCIQACCTT 



21 

! 

CCTCGCTTGC 
ACRACAGAGA 
AAAOTTGGOGA 
TGAASATCTT 
AACATCATTG 
TAATQACTAC 
TTTTCACTGG 
AAAATTTCCA 
GGAAGCASTC 
AGAAI9AAAAT 
GAAGCAGGCT 
GTATTACATT 
TGTTTTTAAA 
TACAATGCAA 
GCAAQUffCAC 
TGAAGCAGTT 
CCTTCTTGTT 
AQTTTTQTAC 
CTATCAAGAC 



21 

i 

ATTCAGCTCC 
AAACTTGTTG 
AAGAAATATC 
ACACAAiOTAA 
GftAAACACKT 
CSQTGTCAGOQ 
GGAAAATGCA 
CTTGAOATGC 
AAAGQAAAAG 
TTGGATTTCA 
GCTTTAGATC 
TACAATGGCT 
GATACAGTTA 
CAATCTGGTT 
AAGTTCTCXA 
T6TAGTTCAG 
ACATGGGAAA 
CAQCAGTTOa 
TTGGGTGCTA 



41 
1 

TCTGTGTTTS 
AacnGATTGG 
C3iACATGTAA 
AXOTGAATCT 
TCATTCAIAA 
GAGGAGTTTC 
ATATGTCATC 
AAATCTACTG 
GGAA6TTAAG 
AASOBATTKT 
CATTCATACT 
CATTGACATC 
GCATCTCTGA 
ATGTCATGCTT 
GACAGGXGTT 
AAOCnGAAAA 
GAOCIOGAGT 
ATGOAOAGQA 
TTCTCAATAA 



CATGCCTACT 
AATCA2U6GAG 
AGACASTGCT 
CAATCGCATA 



GATAATGCT6 
GA6GAA8AGG 
ACAAACCAAA 
GGGAGGAAAT 



CACIAAATTA 
ACCTCACRCr 
ATCTCCACAT 
ATAIQAGGA3 
AGGCTCCAGT 
TATAXTITCC 
TAOAAATGCT 
TATQSAQGGA 
ATCSUSOCAGA 



TCTQOAAATG 
TTTTAGOOCA 
GA(2AACCCAA 
AGC7GAGGGG 
GAC^TTATC 
GACTGCAGAC 
AOCTATCTTT 
TGTTGCAGAT 
TACIVJITQAC 
IGGATACATA 
AAAGGATGGC 
AAAAGCTTAT 
GATATGOGAA 
GAGAAAATGT 
CACTCAGAAO 
CACAAAAATA 
TGACTACACQ 
IGTGAGAAAG 



GCXIIACAGAAA 
aTGQAAOGTA 
ATGAACTTGT 
GAQAGTTTAT 
CCOGKT^ACTT 
TCGGAAAACC 
'TCCGAAGATT 
AATGTGTGGT 
GABAOCTTTC 
AAOTCCTTTT 
CCACKTCKUT 
TCCrCCftGAC 
OGGGTATACA 
TTGGAATCGG 
TQTCXn^IGG 
TTTTACITRG 
CCAATTTCAG 
TTACATGCAA 
TTAaOTATTA 
AATAIGGXTG 
AAACXOACTG 
ATTCCTGCCC 
CATAATGTGG 
GATCAGTACT 
AGTGT6CAAG 
AAAAAGGGCr 
CABTGGCCTG 
GCAGCCXATO 



AACTTGATCT 
GAAAAGACAT 
TCAGGAAAAA 
ACAft:rGAAGC 
ATGTTCCSaiA 
AAGAIAXT^rC 
CTTCRGCCrC 
GGGGGAjCTGC 
TGACCASTTT 
CTQCTATCCC 
CAGAGACAAT 
CAACnXZATC 
TTCCTAGCTC 
TCCAGACTAA 
CT6CASGCCC 

AACAGGATTT 
ATGAGGCCAG 
AGAAGAAGGG 
TTCTTQTOGG 
AGGACAGIAC 
ATGATGTCGO 
6TAGTGGGTT 
CAGCAOACAG 
CCTAT6ATCA 
ATTATATCIkA 
AAGGOCCACT 
AAGTTATTGT 
GGCCTGCOGA 
TGCriXSCCTA 
CCCAGAAAGG 
ACATOGGAGT 
CCAAQGGCXA 



GCATTCCCAC 
CftGTCTGCAG 
ATGGGTCTGG 
CGAGCACCCT 
GTGCCCTGAT 
ACAAAAAACA 
AAGGCCTGAG 
AGGATCCGCT 
TGAAGCTCAC 
TTOCTAAAGA 
TAAATOTCAT 
GAAACIGATA 
AAQTCTTAGG 
TATTGATTAG 
ATGTAACTCT 
TGAAXCCCAC 
GTOGTATGAT 
AGTATTTTGG 
TTTATTCOVT 
XCTTGTGTAT 
GACTTTCCAA 
TCrGTTrGTA 
CIGAAAAATA 



51 
I 

CCGCCTGGAT 
CTGGTCCTAT 
TAGCCCAAAA 
T^AAGAAACTT 
CAClGGGaUlA 
AGAAATGGTG 
1GATGGATC3V 
CTTXGAT6CA 
AGCTTTATGC 
TGAXGGA6TC 
GTTGAACCZT 
TCCTCCCTGC 
AAGCCAI5ITG 
GATGGACTAC 
TTCCrCKTAC 
TGTTCAGGCT 
OGTTTATQAT 
CCAAACCAAG 
TXtGCZAOCC 
TaOAAAATAC 
TTTCCCTGAA 
TQAAQAAGGC 
GGAACCCCAG 
GAAOACTAAC 
TACATCTTTA 
CTTGA^TTCT 
TTTAAAT6AT 
AGAATOCTEA 
C3UU3CTTGAT 
ATTCATCTCT 
AACATATGAT 
AGGTTCMGAA 
TACAGACATA 
TTACACTGAG 
A6TGATGTCA 
CTACTTOCCA 
G6TCTGCA£» 
TAATAGTAOC 
AGTTATACCC 
TATTCTCATG 
AfCCCSCTAOA 
AGC3UVTTCCA 
a:ACTGAAGAA 
CTCCAACCAC 
IAGC2UQGGIT 
TGOCAATTAT 
GAAATCCACA 
CAT6ATAACA 
TGG6AGTGA6 
rCATACTGiG 
AAQAGOCAGT 
ACCAGAGTAC 
TGCAGTGGGG 



1320 
1300 
1440 
ISOO 
1560 
1620 
1680 
174C 
IBDO 
1B6D 
1920 
19B0 
2040 
2100 
2160 
2220 
2280 
23A0 
24O0 
2460 
2520 
2580 
2640 
2669 



60 

120 

ISO 

240 

300 

360 

42D 

480 

540 

600 

660 

720 

780 

840 

90D 

560 

102D 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

ISOO 

1560 

1620 

16B0 

1740 

ISOO 

1860 

1920 

1980 

2040 

21D0 

2160 

2220 

2260 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

28B0 

2940 

3000 

3060 

3120 

318D 



AGXATGTTGC 
ATCCGTTCAC 
ACACTGGTTG 
TATGTTAATO 
CAGCTCCTGA 
AACAGOGAAA. 
TCATCCCTGA 
CAGAGCAATG 
AsGGATGATAT 
GC3U3AA6AT6 
AAGQTCACTC 
CAGGACTTTA 
TGTOCXARAT 
ATAAAAGAAG 
GXGACaaCAG 
TCOGTGQATG 
GACATTGAGC 
BAAGAGftATC 
GCTOAGAGCT 



TCCACTGCAO 
AGCAGATTCA 
AAAGAAATTA 
AGGCCRTACT 
CACTCCTCAT 
GOCAGTCAAA 
AGAATCGAAC 
GT6GASAAGG 
AATTCATCAT 
OGGACXZATAA 
AATTTGTITA 
TTATGGCmQA 
TCTTAGAAGC 
GGCCAAATCC 
AAGCXGCCAA 
GAACTTTCTS 
TTTACCAGGT 
AGTATCAiSTT 
CATOCACCTC 
TAOAiQTCrTT 



TGCTGGAGTT 
ACAOGAAGGA 
TTTGGTACAA 
TAGTAAASAA 
TCCTGGftCCA 
TATACAGCAG 
TTCTTCTATC 
CACAGACTAC 
TACCCAGCAC 
TGCCCAACTG 
CTGGCCAAAT 
A6AACACAAA 
TACACAQ6AT 
AOATAGCCCX: 
TAOGG&TGG6 
TaCTCTGACA 
AGCCAAGATQ 
TCTCTACAAA 
TCTCGACAGT 
A0TTTAA 



GGAAGAACAG 
ACTOTCAACA 
ACTGASGAGC 
ACTGAGGTGC 
GCAGGCAAAA 
A8TGACTATT 
ATCCCTGTOG 
ATCAATGCXrr 
CCTCTCCTTC 
GTGGTTATGA 
AAAGATGAGC 
TGTCTATCTA 
GATTATGTAC 
ATTAGTAAAA 
CCTATGATTG 
ACCCITATC3C 
ATCAATCTGA 
GIGATCCTCA 
AA1K?l?T0CnQ 



GCACATATAT 
TATTTGGCTT 
AATATOTCTT 

tsgacagtca 
caaagctaga 
ctgcagccct 
aaagatcaaq 

cctatatcat 
ataccatcaa 
ttcctgatg6 

CTATAAATTG 

ausaggaaaa 

TTGAAGTGAG 
CTTTTGRACT 
TTCATGATGA 
ACCAACTAGA 
TGAGGOCAiGG 
GCCTTGTGAG 
CATTGGCTGA 



TGIGCTAGAC 
CTTAAAACAC 
CATTCATGAT 
TATTCATGCC 
GAAACAATTC 
AAACCAATGC 
GGXTOGOVXT 
GGGCTATTAC 
atJATTTCTrGG 
CCAAAACAT6 
TGAGAGCTTT 
ACTTATAATT 
GCACTTTCAG 
TATAAGTOTT 
GCATCGAOGA 
AAAAGAAAAT 
AOTCTTTGCT 
CACAAGGCAG 
TOQAAATATA 



3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
40B0 
4140 
4200 
4260 
4320 
4347 



Seq ID NO: C97 DNA Sequence

Nucleic Acid Accession #: XM_031379
Coding sequence: 148..7095

1 11 21 31 41 51

| | | | | |

CACACATAOG CAOGCAOSAT CTCACTTOGA TCTATACACT GQAQGATTAA AACRAACAAA 
CAAAAAAAAC ATrrCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAA6CAG AGGAGCCGC3V 
CGGCGAGGGG CCQCAGACOG TCTGGAAATQ CQAATCCTAA AGGGTTTCCT CXSCTTGCATT 
CAGCTCCICX GXGTTTGCCX3 CCTGGATIGG GCTAATOGAT ACTACAGACA ACAGAGAAAA 
CTTGTTOAAS AGATTOaCTG GTCCTATACA GOAGCAClGA ATCAAAAAAA TTGGGQAAAa 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTOCTATCA ATATTGATGA AGATCTTACA 
CAAOTAAATG TGAATCXTAA GAAACTTAAA TTTCAGGGTT BGGATAAAAC ATCATTGGAA 
AACACATTCA TTCATAACAC TOGGAAAACA 6TGGAAATTA ATCTGACTAA TGACTACCGT 
GTCftOamS GAGTTTCAGA AATGOTGTTT AAAiGCAAGCA AXBVTAACTTT TCACTGGGOA 
AAATGCAATA TCTCATCTGA TGGATCAG2^G CATACTTTAG AAG6ACAAAA ATTICCACII 
GAQATaCAAA TCTACTGCTT TOATGOTOAC COATTTTCAA GTTTTGAGGA AOCAQTCAAA 
GGAAAAGGGA AGTTAAGAGC TITATCCATT TTGTTTGAjGG TTGGGACAGA AGAAAATTIG 
GATTTCAAAQ CGATTATTGA TGGAQ^X?QAA AE3TGTrAC3TC QTTTTGGGAA GCAGGCTGCT 
TTAGATOCAT TCATACIGTT 6AACCTTCT5 OC3UULCTCAA CTGACAA6TA. TTACATITAC 
AATGQCTCAT TOACATCTrCC TCCCTOCACA OACACAGTTa ACTGOATTGT TTTTAAAOAT 
ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 
TrCTGQTTATG TCATQCTGAT GGACTACTTA CAAAACAATT TTGGAGAGC3^ ACAQTACAAG 
TTCTCTAOAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATXCATGA AGCAGTTTGT 
JUXTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAiGAGAATT ATAOCAGCCT TCrTGTTAiCA 
TGGQAAAGAC CTCGAGTOGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGtACCAG 
eAOTTGSATG OftiSAGGAOCA AAOCRAGOVT GAATTTTTGA CaOATBSOTA TCAAQACTTG 
GGTGCTATTC rcAATAATTT GCTACCCAAT ATSAGTTATG TTCTTCAGAT AGXAGCCATA 
XGCACTAATG GCTTATATGO AAAATACAOC GACCAAdOA TTGTGGACAT GCCTACTOAT 
AATOCTGAAC TTGATCTmr CCC»3ftATTA ATTGGAACTG AAGftAATAAT CAASSAGGAG 
GAAGAQGGAA AAOACATTGA AOAAOaOGCT ATTGTGAATC CTGGZAGAlSA CAGTGCTACSI. 
AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAiOOG 
AOGAAATACA AimAjQCCAA GACXAACCGA. tCOOCAACAA GAGGAAGIGA ATXCTCrGGA 
AAGQQTaATG TTCXX»ATAC ATCTTTAAAT TCCACTTCCC AAOCAGTCaC TAAATTAGOC 
ACAGAAAAAG ATATTTCCTT ^ClTCTCAG AdGTGMCIG AnCTGGCAOC TCACACTGIG 
GAAGGTACTT CAGCUTCiTT AAATSATGGC TCTAAAACTG TTCmuSATC TCCACA7AT8 
AACTTGTCOG OGACTGCAGA ATCCTTAAAT ACAGTTTCTA TA7VCAGAATA TGASGAGGAG 
AGTTTATTGA CCZAGTTTCAA GCTTOATACT GQAQCTGAAG ATICTTCAGO CTCCAGTCCC 
GCAACTTCTG CTATCCXMT CATCTCTeAG AACATATCCC AAQGGTATAT ATTTTCCTCC 
GAAAACOCAG AGACAATAAC ATATGATGf C CrCA13U3CAG AATCnXTCAG AAAXGCTTCC 
6AAGATTCAA CTTCATCAGG TTCAGAAQAA TCACTAAAGQ ATCCTTCXAT GQAGGGAAAT 
GT6TGGTTTC CTAGCTCTAC AGACATAACA GCACAGCXX3G ATGTTGGATC ACGCAGAGA6 
AGCTTTCrCC AGACTAATTA CACmSAQATA a3TGTT6ATG AATCTGAGAA OACAACCAAG 
TCCXTTTCTQ CAGGCCCSyGr GAOX^TCACAG GGTCGCTCAG TTACIU3ATCT 6GAAATGCCA 
CAnPATTCTA CCTTTGCCTA CTTCCCSVACT OAGGXAACAC CTCATGCTTT 73kCCCCAXCC 
TCCAGACAAC AG6ATTTGGT CTCCA03GTC AACGTGGXAT ACTCGCiVSAC AACOCAAOG6 
G1:J^ACAATG GTQAOACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCrCTCTAGTC 
ACCCCTTTGT TGCTTGACAA TCAGATOCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 
TQGGCCTTGC ATGCTACGOC TOTATTTOCC AOTOTCGAXG TGTCATTTOA ATCCATCCI^G 
TCTTGCIKIG AXGGKSCAGC ITTGCTSCJCA YXXTCCICTG CTTOCITCM3 XAGIGAAOITG 
TTT06GCATC TGCATACAGT TTCTCAAATC CTTCCACAAS TTACTTCAOC TACCGAQM?T 
GATAAGGTGC CCTTGCAXGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 
AGCCTTGCTC AGTATTCTCJA TGxeCTGTCC ACTACXCATQ CTGCTTCAQA GACGCTGQAA 
TTTGGTAGTG AATCIGGTGT TCTTTATAAA ACGCITATGT TTTCTCAAGT TGAACCAOCX: 
AGCnSTGATO CCATGATSCA T6CAOGTTCT TCAGGGCCTB AAGCTTCTTA TGCCTTGrcr 
GATAATGASG GCTCCCAACA CATCTTCACT 6TTTCTTACA GTTCTGCAAT JtCCTCTGCAT 
GATTCTGTGG QXaXAACTTA TCAGGGTTCKI TTATTTAGOG GCCCTAiSCCA TATAOCAATA 
CCTAAQTCTT OGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 
GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATACTCAAT TTCTTTTACC TGACACfiflAT 
QGGCTGACAG CCXHITAACSWT TTCTTCACCT GTTTCTaTAG CTGAATTTAC ATATACAACA 
TCTGronm GTGATGATAA TAAGQOGCTT TCXAAAASTG AAATAA1ATA TGGAAATGItf} 
ACCGAACTGC AAATTCCTTC TTTCAATGAG ATGOTTTACC CTTCTGAAAG CACASTCATG 
CCCAftCATGT ATGATAATGT AAATAAflTTG AATGOGTCTT TACAASAAAC CTCTGTTTCX: 




60 

120 

X80 

240 

300 

360 

420 

480 

540 

600 

660 

720 

7 SO 

840 

900 

96C 

1020 

1D80 

1140 

120O 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2260 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

ai2D 

3180 

3240 

3300 

3360 

3420 



ATTTCTAGCA CCAAGGGCAT GITTCCKSGO TCCCTTGCTC ATACCAjCCAC TAAGCSTTrTT 3480

GAXCATGAGft TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCftACCTAC RCATACTOTC 3540

TCTCAAjSCAT CTGGTGACAC TTOGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

AOCTCMCTT CTTTTAOTAC TGAAGTATTG CTACAACCTT CCTTTCAG6C TTCTGATGTT 3720 

GAChCCTTGC TTAAAACTQT TCTTCCAGCT GXGCCCAGTG ATCCAATATT GGTTBAAACC 3780 

0CC2\AAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TT6TATCAAA TTCTGCTTCR 3840

AOTGAAAACA TGCTGCACTC TACATCTGTA CCAQTTTTTO ATGTQTCX3CC TACTTCTCAT 3900 

ATGCACrCTG CTTCftCTTCA AGGTXTGACC ATTTCCTATG CAAGTGAGAA ATATQAACCA 3960 

artTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGfifi 4020 

tTGTTCCAAA GGGCCAATIT G6AGATIAAC CAGGCXXATC COCCAAAAGG AAGGCATG7A 4080

TTTGCTRCAC CTQTTTTATC AATT6ATGAA CX»3TAAATA CACTAATAAA TAAfiCTTATA 4140 

CATTCCX3ATG AAATTTTAAC CTOCACXiAAA AGTTCTGTTA CTGGTAAQGT ATTTGCTGGT 4200 

ATTOCAACAfi TTGCTTCXGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260

GGGCATQTTG CCATTACAGC TOmrCTCCC CACAGAGATG GrrClOTAAC CrrC3UU:AAAG 4320 

TTGCXCTTTC CTTdAAGGC AACTTCT(^ CTGAGTCATA <3TGCCAAATC TGATGCOQGT 4380

TTftSTGaOTG GTGGTGAAGA TOGXGACACI GATGAT6ATG GTGATGATGA TGATQATGAC 4440 

ASAGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGG TAATGAATGA TTCAGACACC CACGAAAACA GTCTTATGGA TCAGAATAAT 4560

OCAATCrCAT ACTCACTATC TGAOAATTCT GAAGAAGATA ATAGAGTCAC AAGTGZATCC 4620

TCAGACAGTC AAACTGGTAT GGACAGAAGT OCXGGXAAAr CACCATC3U5C AAATGE3GCTA 4680

TCOCAAAAGC ACAATGATGO AAAAQASGAA AATGACATTC AGACTGGTA6 TGCTCIGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GlGATGAAfiA AAGTGQATCA 4800 

QGGCAAGGTA CCTCRfiATAG CCTTAATGAG AATQAQACTT C3C3UZAQATTT CAGTTTTGCA 4860

GACACXAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACICAGA AATAACIGCT 4920 

GQATTCCCAC AGTOOCCAAC ATCATCTGXTi ACXAGGGAGA ACICAGAAGT tSTTCCACGTT 4980 

TCnSAGGCSUS AfiCSGCftOTAA TA^AGCXSLT GAGTCTC33TA TTGGTCTAGC TGAGSGGTTG 5040 

GAATCXJBAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTOT 5100

CTACTGGTTC i:tgtgggtat tctcatctac tgoaqqaaat gcttccagac tgcacacttt 5160

TACTTAGAGG AOWSTACATC CCCTAGAGTT ATATCCACAC CTOCAACACX: TATCTTTCCA 5220 

ATTTCAGAT6 AT6T0GGAGC AATTOCAATA AAGCACTTTC: CAAAGCATGT TQCAGATTTA 5280

CATGCAAGTA GTQQQTTTAC TGAAGAATTT GAGACAC?rGA AAGAGTTTTA CCA6GAAGTG 5340 

CAGAGCTGTA CTGTTeACTT AGQTATTACA GCAGACAGCT CCAACCSiCCC AGACRACAAG 5400

CACAAGAATC OATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAQ 5460

CTTGCTGAAA AGGAXGGCAA AjCTGACTGAT TATATCAATG CXZAATTATST TGATGGCTAC 5520

AACAGACCAA AAGCITATAT TSCTGOCCAA GGCCCACIGA AATCCACAGC TGAAfiATTTC 5580 

TOGAGAATGA TATGGGAACA TAATGT6GAA GTTA3TGTCA TGATAACAAA CCTOGTGGAG 5640

AAAGGAAGQA QAAAATGTGA TCRGTACTGG CCTGCCJGATG GGAGTGAGGA GTACX3GGAAC 5700 

TTTCTGGTCA CTCA6AA6AG TGrGCAAGTG CTTGCCTATT ATACTGTQAa GAATTTTACT 5760

CTAAGAAACA CAAAAATAAA AAAGGGCnXX: CAGAAAG6AA GACCCAGTG6 AOSXGlGGTC 5820

ACACAGTATC ACIACAGGCA GTGGXX:T6AC ATGGGAGTAC CAiaAGXACTC CCTISCCAGTG 5880

CTGACCTTTG TGAOAAAOQC AGCCTATGOC AAGOSCCATG CAGTGGOGOC TGTTGTOGTC 5940 

CACTGCAOTG CTGGAGTXGG AAfiAACS^iOeC ACATATATTG TGCXAGACAO TATGTTGCAG 6000 

CAjGATTCAAC ACQAAGQAAC T^rCAACATA TTTGGCTTCT TAAAACACAT CCGTTCAC3UV 6060

AQAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATOKTAC jUrTGGTTGAG 6120

GGCAtACTTA GTAAAOAAAC TGAGGTOCTS GACAGTCATA TTCATGOCIA TGTTAATGCA 6180 

CTCCTCATTC CIGQACCAlGC AGGCAAAACA AAGCTAGASA AACAATTCCA GCTCCTGAGC 6240

CAGTCAAA*I!A TACAGCRSAS TGACTATTCT GCAGCOCTAA AGCAATGCAA CaOOGAAAAG 6300 

AATCOAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTQGCATTTC ATCCCTGAGT 6360

GGAGAAGGCA CAGACTACAT CAATGCCTC3C TATATCATGG GCIATTACCA GAGCAATOAA 6420 

nrCATCATTA COCAGCACGC TCICCITCAr 2^CATCAAGG ASTTCTSGAG GATOATATGG 6480 

GACCATAATG CCCRACXGGT GGTTATQATT CCrtSOQQCC AAAACATGGC AGAA8ATGAA 6540

TTTGXTTACT GOCCAAATAA AGATGAGCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTOAAG AACACAAATG TCTATCTAAT GA£3GAAAAAC TTATAATTCA GGACITTATC 6660 

TTAGAAGCTA CSVCZAGGATGA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAAXGQ 6720

OChAATGCAS ATAGGCGCAT TCHGIMSMCX TCTGARCTXA TAA6TGTTAT AAAAGAAGAA 6780 

GCTGGCAATA GOGAIGSGOC TATGATTGTT CATGATGASC ATGGAGGAGT GACGGCAiaaA 6840

ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC COTGQATGTT 6900 

TACCAGGTAG GCAAGATGAT CAATCTGATQ AGGCCAGGAG TCTTIiGCrGA CXTTGHG/CRG 6960

TAICnCTTTC TCTACAAAGT 6ATGCICAGC CTTGIGAGCA CAAGGCAGGA ACSAGAATCCA 7020 

TCCACCTCTC TGGACAGXAA TGGTGQllGCA TTGCCTGATQ GQWAXATAOC TIQAGIUSGTTA 7080 

GAdSTCrtTAB TTTAACS^CAG AAAGGGGTGG OOGGACTCAC ATCXiQAQCAT TGTTITOCTC 7140 

TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGXAACr TTCAIGACAT AQCSATTCTGC CQCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTOGAA GACTTGTAAT TTACTTATTA T6TTTGAACC AAAATGAT7G AATtTTAC^ 7320 

TATTTCTAAS AATQGAATIG tOCSXATSTET TTCT6TATTO ATTTiaACAlS AAAATTTCAA 7380
T7IA1»GAGQ TTAQGAATTC CAAACXACAG AAAAT6TTTG TTTTTAiGTGT CAJVATTITIA 7440 
GCTGTATTTG TAGCAATTAT C3U3GTTTGCT AGAAATATAA CTTTTAAXAC AGTAGCCTGT 7500 
AAATAAAA£» CTCTTCCftTA TGATATTCAA CATTTTACAA CTGCAOTATT CACCXAAAGT 7560
AGAAATAATC TGTTACITAT TGTAAATACT GCCCTAGTOT CrTCCATGGAC CAAATTTATA 7620 
TTI3WTAATTG TAGATCTTTA TATTTTACTA CIGAdSTCAAG TTTTCXAGTT CTGIGSAAXT 7680 
GTTTAGTTTA ATGAOGTAGT TCATTAGCTG GTCTTACTCT AGCA6TTTTC TGAC3VTTGTA 7740 
TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACrTTT GTGGAAAATA 7800 
OAAAPACCTT CATTTTGAAA GAAfiTTTTTA TGAQAATAAC ACCTTACCAA ACATTOTTCA 7860 
AAXGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCAXXA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 7944 
CAAAAAAAAC ATTTCCITOG CTOCXTCCTCC CTCCCCACTC TGAQAAGCAG AGG»8CGGCA 120

CGGCXSAfiGGO CC33CAQACCG TCTGGAAATG CGAATCCTAA AGOC5TTTCCT CGCTTGCRTT 180

CRGCTCCTCT GTGTTTGCCG CCTGGATTGG GGTAATGGAT ACTACAGACA ACAGRGAAAA 240 

CTTGTTGAAG AGATTQC3CTG GTCCTATACA GGAGCACTcaA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAQ CCCAAAACAA TCTOCTATCA ATATTGATGA AGATCXXACA 360

CAAGTAAATG TGAATCTTAA 6AAACTTAAA TTTCABGSIT GG6ATAAAAC ATCATTGQAA. 420 

AACACATTCA TTCATAACAC TGGC3AAAACA GTGGAAATTA ATCICACTAA TQACTAC06T 480

QTCAGCJQQAa QAGTTTCAGA AATGGTGTTT AAAGCAAiGCA A(3ATAACTTT TCACIGGGGA 540 

AAAtGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAQGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCXT TGATGCGQAC CGATTTTCAA GXTTTGACeSA AGCAGOTCAAA 660

G(3AAAA66GA AGTTAAGAGC TTTATCCATT TTOTTTGAlSG TTOGSACAGA AGAAAATTT6 720 

GATTTCAAAG GBATTATTGA TGGAGTCGAA AGTGTTAQTC GTTTTGGGAA GC^fiGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGRCAAGTA TTACATTTAC 840 

AATGGCTCAT TQACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAG&T 900

ACAGITAGCA TCTCTGAAAG CCAjeTTGGCT GTTTTTTGTG AAGTTCTTAC AA7GCAACAA 960

TCTGGTTATQ TCATGCTGAT GGACTACTTA CAAAACAATT TTGGAGAGCA ACaGXJVCAAG 1020 

TTCTCtAGAC AGGTGTTTTC CTCATACACT QOAAAGGAAG AGATTCaTGA AGCAOTTIGT 1080

AGTTCAOAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAQCCT TCnTGTTACA 1140 

TGGGAAAGAC CTOGAGTCQT TTATGATACC ATGATTGAGA AtSTTTGCAGT TTTGTACCRfi 1200 

CRGTTGGATG GAGAOGACCA AAOCAAQCAT GAATTTTTQA CflfiATGGCTA TCAAiCSACTTG 1260

GGTGCIATTC TCAATAATTT GCTACCCAAT ATQAGlTATG TTCTTCAGAT A6XAGCCATA 1320 

TGCACTAATQ OCTTATATGG AAAATACAGC GACCAACTGA TIGTOGACAT GGCTACTGftX 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTOGAACTG AASAAATAAT CAAGGAGQAG 1440 

GAAOAaOOAA AAGACATTGA AQAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTCCTACA 1500 

AACCAAATCA OQAAAAAGGA ACOCXAOATT TCTACCACAA CACACTACAA TGQCATAGGG 1560

AGQAAATACA ArTGAAGCCAA GACTAAGCGA TCCCCAACAA (SUSOAAGTGA ATTCrCTGGA 1620 

AAGGGTCATG TTCCCAATAC ATCTTiaAAT TCCACTTCCC AAOCAGTCAC TAAATTASOC 1680

ACAGftAAAAG ATATTTGCTT GACTTCTCAG ACIGXGACTG AACTGCCAOC TCACAITrOTG 1740 

GAA/GOTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAQATC TCCACATATG 1800 

AACX!CGTCOG GGACTQCAGA ATCCTTAAAT ACAGTTTCTA lAACAGAATA TGAGGAGGAG 1860

AOTTTATTGA OCAGTTTCAA QCrXGMACT GGAGCTGaAG ATTCTTCAGG CXGGAGTOCC 1920

GGAACTTCTG CTATCOCAT^T CATCTCTGAG AACATATCCXZ AAG6GTATAT ATTTTCCTCC 1980

GAAAACXXIAG AGACAATAAC ATATQATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

G2iAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

QPTOTGaTrrC CTAGCTCTAC AGACATAACA GCACAQCCCSG ATGTTGOATC AGGCAGAGAG 2160

ASCTTTCTCC AGACTAATTA CACTGAQATA CGTGTTGATQ AATCXGAGAA GACAAiCCAAG 2220 

TCCTTTTCIG CAGGCCCA6T OATGrCACAG GGTCCCTCAfi TTACAOATCT 6GAAATGCCA 2280 

CATTATTCTA CCTTTGCCIA CTTCCCAACT GAGGTAACAC CTCSCTGCTTT TACCCCAICC 2340 

TCCAGACAAC AGGATTTGGT CTCCSiCGGTC AAOGTGGTAT ACTCGCAGAC AACCCAACOS 2400 

6TATACAATG CAOAOGCCAG TAATAOTA0C CATGAdGTC?rC GlATTGGTCT AGCTGAGGGG 2460

TTGGAIVrCCG A8AAGAAGGC AflXmEAOCC CTXGTGA'irOG TGTCAQGCCT GACITTTATC 2520 

TGTCTAGTCG TTCTTGTGG6 TATTCTCATC TACT06AGQA AATGCTTCCA GACTGCACAC 2580 

TTTTACXTAG AGGACAGTAC ATCXXXZTAGA GTTATATOCA CA£X:TCC:AAC ACCTATCTTT 2640 

CCAATTTCAG ATQATGTCX3G AGCAATTCdA ATAAAGCACT TTCCAAAGCA TOTTQCAGAT 2700 

TTACATOCAA GTAGTGGGTT TACTGAAGAA TTTGAQAOVC XGAAAGAGTT TTAOCAGGAA 2760

GTGCAGAGCT GTACTOTTQA CTTAGGTATX ACAdSCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCSICAAGA ATCGATACAT AAASATC3STT GCCTATSATC ATAG^UaaOT TAAGCTAGCA 2880

CAGCTTGCTG AAAABQATGG CAAACTGACT GATTATATCA ATGOCAATTA TOTTQATGGC 2940 

TAC3U\CAGAC CAAAAGCTTA TATTGCTGCC CAAOQCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCIX3GAGAA TOiWIATGGGA ACATAATOTG GAAGTTATTQ TCAXGATAAC AAAGCTOGIG 3060

GAGAAAGGAA GGAGAAAATS TTGATCAGTAC TSGCSCTGCCG ATOaOAGlXSA GGAGTACOGG 3120 

AACTTTCTG8 TCACTCAGAA GAGTGTGCAA GTOCTCaCCT ATTATACTGT GAGGAATTTT 3180 

ACrCTAAGSVA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGAGCC2U3 TGGACBTGTG 3240 

QTCACACAGT ATCACTACAC GCAQTGGCCT GACATQGGAG TAOCAGAQTA CSTCCCTGOCA 3300 

GTGCTGACCT TTGTGAGAAA GGCAGCCIAT GCCAAGOGCC ATGCAGTGGG GC3CTGTTGTC 3360

6TCCACIGCA GI6CTGGAOT TGGAAGAACA GQCAfSlXATA TTGTGCTAGA CAGT ATGTT G 3420 

CAGCAGATTC AACAOGAAGG AACTGTCAAC AXATTTG6CT TCTEAAAACA CATCOSTTCA 3480 

CaU\AGAAATT ATTT6GTACA AACTGAGGAG CAATAlGrCT TCATTCAIGA TACACTGOIT 3540 

GAGGOCATAC TTAGTAAAGA AACTGABGTG CrQGAC3«3TC ATATTCATGC CTATGWAAT 3600

GCACIOCrCA TTCCTGGACC AGCAGSGAAA ACAAAGCTAG AGAAACAATT CCAGCTCCIG 3660

ASOQU?TCAA ATATAGAGCA GAGTOACXAT TCTGCAGCXX: TAAAGCAATG CSkACAGGGAA 3720 

MOMaOS/A CTTCTTCTAT CATCOCrTGTG GAAAGATCAA GGGTXGGCAT TTCATCXCTG 3780 

AGOTGGAGAAG GCACAGACTA CATCS^TGCC TOCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

GAATTCATCA TTACXJCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCAIA ATGOCCAACT GGTGGTTA1G ATTCCITGATG GCCAAAACAT GGCAGAAGAI 3960

GAA7TTGTTT ACTGGOCAAA TAAAGATSAS OCTATAAATT GTGAGAGCCT TAAGGTCACT 4020 

CTTATOGCTG AAQAACACAA A.TGTCTATCI AATGAGGAAA AACTTATAAT TCAGGACITT 4080 

ATCTTAGAAG CTACACAGGA TQATrATGTA CTTGAAGTGA GGCACTTTCA QI61^CCXAAA 4140 

'CGGCC3MTC CAGATAGCOC CATTAGTAAA ACTTTTGATiC TTATAAGTGT TATAAAAGAA 4200 

GAAGCXGCCA ATAGGGATGG GCCTATGATT 6TTCATGATG AGCATGGAG6 AGTGAC6GCA 4260

GQAACTTTCr GTGCICTGAC AACGCXKATG CACCAACCAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGGCAAOAT GATCaWTCTG ATGAGGCCAO GAGTCTTTGC TGACATIGAG 4380 

ca£3TATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAQGCA G6AAGAGAAT 4440 

CCATCCACCT CTCTGOACAS TAATGGTGCA GCATTGOCTG ATGGAAATAT AGCTGACAGG 4500

TTAfifiGTCTT TAICTTTAACA CAGAAAQQ3Q TGGGOOGACI CRCATCTGAG CATTGTTTTC 4560

CTCTTGCTAA AATTAGGCAG GAAAATCAGT CXAGTTCTQT TATCTGTTGA TTTCCX3vrCA 4620 

CCTGACAGTA ACXTTCATCSA CATAGOATTC TBCCGCCAAA TTTATATCAT TAACAATGTG 4680 

TSOCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTOAArrTXA 4740 

CAGTATTTCT AAGAATG6AA TTGlGGTATT TTTTTCTGTA TTQATTTX3\A CAGAAAATTT 4800 

CAATTTATAO AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860

TTAGCIGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAOTAGCC 4920
TGTAAATAAA ACACTCTTOC ATATGATATT CAACATTTTA CAACTGCAGT ATtCACCTAA 4980

AGTAGAAATA ATCIlGTTACrr TATIGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040

AxamrTATAA ttgtagattt ttatattita ctacigagtc AAGrrrrcTA gttgtgtgta 5100
ATTGTTTAGT TTAATQAO&T ABTTO^TrAG CTGGICTTAC TCTACCAOTT TCCSGRCKTr 5160

GTRTTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTO AAAOAAGTTT TTATSAOAAT AACACCTTAC CAAACATIST 5280

TCRAATGGTT TTTATCXMG GAATTGCRAA AATAAATATA AATATTOCX3V TTAAA&AftAA 5340 

AAAAAAAAAA AAAAAAAAAA AAAAAAA 5367

Seq ID NO: C99 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 501..4514

1 11 21 31 41 51

| | | | | |

CaCACRTACG CRCt3CA03AT CTCACTTOGA TCTATACACT GGAGGATTAR AACAAACAAA 60 

C2VAAAAAAAC ATTTCCTTCG CrrCCXXJCTCC CTCTCCACTC ^GA£3AASCAG AOGAGCCGCA 120

COQCGAGGOa CCX3CAGACCG TCTGGAAATG CGAATCCTAA AGOTTTTCXZT CQCTTGCATT 180 

CAGCTCCTCr GT6TTTGCOG CCTGGATTGG GCTAATGGAT ACIACAGACA ACAGAGAAAA 240 

CTTOTTGAAG AGATTGOCTG QTCCTATACA GG&GCACTGA ATCftAAAAAT TaaOGAAAGA 300 

AATATCCAAC ATGTAATAGC CCaU^AACAAr CTCCTATCAA TATTGATGA^ GATCTTACAC 360

AAGTAAATOT GAATCTTAAQ AAACTTAAAT TTCAQGGTTa QGATAAAACA. TCATTGOAAA 420 

ACACATTCA-r TOVTAACACT GGGAAAACAG TGGAAAOTAA TCICACTAAT GACTACC&XG 480 

TCAQCBCaAQG AOTTTCAOAA ATGGTGTTTA AAOCUUCSCkA GATAACTTTT CACTGCSSOl^A 540 

AATGCAATAT GTCATCTGAT QGATCAGftSC ATAGTTTAGA AGGACAAAAA TTTOCACTTG 600

AGATGCAAAT CTACTOCTTT GATaCDGACC OATTTTCTkAG TTTTGAGGAA GCAGTCAAAS 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTG6 720

ATTTCAAAGC GATTATTQAT GGAOTCOAAA GIGTTAGXCX3 UTIGGGAAG CAGGCTGCXT 780

TAOATCCATT CATACT6TT6 AACCTTCTOC CAAACTCILAC TGACAAQTAT TACATTTACA 840

ATGGCTCATT GACATCTCCr OOCTGCACAG ACACA6TTGA CTGGATTGTT TTTAAAGATA 900 

CAGTTAOCAT CTCTGAAAQC CASTTGGCTG TTTTTTGTGA AGTTCTTACA ATQCTiACAAT 960

CTGGTTATGT CATGCTGATG GACTACTTT^ AAAACAATTT TCGAGAGOIA CAGTACAAGT 1020 

TCTCTAGACA GOTGTTTTCC TCATACACTG GAAAGGftAGA GATTCATGJU^ GCAGTTTGTA 1080

OTTCaGRACC A5AAAAT6TT CAQGCTGACC CAGACSRATTA TACCAGCCTT CTTGTTACftT 1140

GGGAAAGACC T0GAGTCX3TT TATG2VTACX!A TGATTGAGAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CRAGJICTTGG 1260

GTGCTATTCT CAftTAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCSITAT 1320 

GCACTAATGG CTTATATGGA AAATACAGOG ACXAACTGAT TGTCGACATG CCTACTGATA 1380 

ArtOCTGAACX TGATCTETTC CCTGAATTAA TTCSGAACTGA AGAAATAATC AAGGAGGABG 1440 

AACTAGGGAAA AGACATTGAA 6AAG60GCTA TTGTGRATCC TGSTA6AGAC AGTIGCTAGAA 1500 

AlXaUATCAG GZiAAAAGGAA CCCCRGftTTT CTACCACAAC ACACTACAAT OGCRTAGGGA 1560

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCOCAACAAt? AGOAAOTGAA TTCTCTGGAA 1620 

AGGGlGATGfT TCCCAATACA ZCTTTAAATT CCACTTCCCA ACCftGTCACT AAAXTAGCCA 1680

CSAOAAAAAGA TATTTOCTT6 ACTTCTCAGA CTGTGACTGA ACTGCC&OCT CACACIGTOG 1740 

AASGtACTTC AGCCTCITTA AATGATGGCT CTAAAACTGT TCTEIIGATCI CCACATATGA 1800 

ACTTQTOGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAQRATAT GAGGAGGAGA 1860 

GTTTATTGAC CASTTTCRAS CITGATACTG GAGCltSAAGA TTCTTCAQGC TCCAGTCCCG 1920 

CAACTTCTGC TATCCCATTC ATCTCIGAGA ACATATCCCA AGGGTATATA TTTTCCTCOS 1980 

AAAACCCAGA GACAA7AACA TATGATQTC3C TTATAOCA8A ATCTGCTAGA AATSCITOCB 2040 

AAOATTCAAC TTGATCAGGT TCU3AAGAAT CM^'AAAGGA TC3CTTCTATG 6AS06AAAT6 2100 

TGTOGt!nCC TAGCTCTACA GACATAACAG CACAGCCCGA KTT<3GATC!A GGCAGAGaGA 2160

GCTTTCTGCA GACTAATTAC ACTGAGATAC GTGTTGATt3A ATCTGAGAAG ACAACCAAGT 2220

OCTTTTCTGC ASGCCX:AGTG ATGTCAC3^G6 GTCOCTCAGl: TACAGATCTG GAAATIGOCAC 2280 

ATXATTCTAC CTTT6GCTAC TTCXICAACTG AGGTAACAGC TCATGCITTT ACCCCATCCT 2340 

CCMSUMCA GGATTTGGXC VCCIVCOGTCA A0GTG6TATA CTOSCAGACA ACGC^AOOGG 2400 

TATACAATGA 6GCCZAC3TAAT AGTASCCATO AOTCTCnTRT TOOTCTAaCT GAGGGCTTTBG 2460 

AATCXSaAGAA GAAGGCRGTT ATACOCCTTG TGATCXSTGTC AGCCCTGACT TTTATCTGTC 2520 

TAGTGGtTGT rtGTGGGTATT CTCATCTACT GGAGGAAAIG CTTCCZWQACT GCRICMCTTTT 2580

ACTTAIlAaaA CAOTS^TCC OCTAGM5TTA TATOCACSU:C TOCAACACXZT ATCTTTCCAA 2640

TTTCAGATQA «3TC(3GAGCa^ ATTGCAATAA AOCACITTOC AAAGCATGTT GCACIATTTAC 2700

ATGCAAOTAa TGGGTTTACT GAAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAAGTGC 2760 

AGAGCTGTAC TGTTGACrCA GGTRTTACRG CAGACAGCTC OiAOCACCCft. QACftACaAGC 2820 

ACftAa21ATG6 ATACATAAAT ATGGTTOCCT ATQATCATAQ CAGGGTTAAG CTAOCACAGC 2880 

1:T(3CXGAAAA GGAXGOCAAA CIGACT6ATT ATATCAATGC GAATTATGTT GATGGCTACA 2940 

ACACSkCSCAAA AGCTTAXAIT GCTQCCCSUIO GCCCACTGftA ATCCACAGCT GAnSATTTCI 3000 

QGAOAATGAr ATGGGAACAT AATGTGGftAG TXATTGTCM QATAACAAAC CTCGTGGAGA 3060 

AAGGAAGGAG AAAATGTGAT CAGTACTGGC CFGCOSATGG GAGTG&OGAG TACGGGAACT 3120 

TTCTGGTCSU: TCAOAAGABT GTGCAAGTGC TTQCCTATTA TACTOTOAGG AATTTTACTC 3180 

!S!AAGAAACAC AAAAATAAAA MOaBBCECCC AG2UMGGAA6 ACGCAGTGGA OSTGIGGTCA 3240 

CSiCABTATCA CTACACGCRG TOQCCTGACA TOGGAOTACC AGASTACroc; CTGOCROTQC 3300 

T6AOCTTTGT GAGAAASGCA GGCTATGCCA AGOGCCATGC AGIGGGGCCT GTTGTCGTOC 3360 

ACTGCAOTGC TGQAOTTaaA AGAACAGGCA CATATATTOT QCTA£3ACAiOT ATGTTQCAGC 3420 

AGArrTOVAQ^ CGAAS3GAACT GTCAACA.TAT TTOGCTTClrT AAAACACATC CGTTCACJUUl 3480 

QftAATTATTT GSTACAAACT GAGGAGCAAT ATGICTTCAT TCATGATACA CTGGTTGAGG 3540

CCATACTTAG TAAAGAAACT GAGOTSClGa ACSUSTCATAX TCATGOCTlUC GTTAATGCAC 3600 

TCCTCATTCC TG6ACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTOCTGAGCC 3660

AGTCAAAIIAT ACAGCAGAGT GACTATTCTG CAGCGCTAAA GCAATGCAAC AGGGA/IAAGA 3720 

ATOQAACTTC TTCTATCATC OCTGTGGAAA GATCAAGGGT TOaCATTTCA TCCCTGA6TG 3780 

GAGAAGGCftC AGACTACArrC AATGCCXCCT ATATCATGGG CTATTACQ^ AfiCAAlGAAT 3840 

TCATCATTAC CX:AGCAiOCCT CTCCTTCATA CCA.TCAAGGA TTTCTOQAOG ATGATA7GGG 3900 

ACCATAATGC CCAACTGGTG Gn!MX3ATTC CTGATOGOCA ARACATOGCA GAAGATGAAT 3960 

TTGTTTACTQ QCCAAATAAA GATGAGCCTA TAAATTSTGA GAGCXTTAAG GTCACTCTTA 4020 

TGOCTGARGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACXTTAXCT 4080 
TAGAAGCTAC ACAGGATGAT TATGTACTTQ AAGTSAGSCA CrTTCAGTGT CCTAAATOaC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAAdT TTGAACTTAT AACTTGTTATA AAAGAAGAAG 4200
CTQCX»ATAG GGATGGGCCT ATGAaTEGTTC ATGATGAGCA TGGAGGAST6 AiCGGCAGQAA 4260
CTTTCTGTBC TCTGACAACC CTTATOCACC AACTAOAAAA AGAAAATTCX: GTCGATSTTT 4320 
ACCAGGTAGC CAAGATGATC AATC7SATGA GGCCAGQAGT CTTTGCTGAC ATTGAGCAGT 4380
ATCACTTTCr CTACAAAGTC ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATOCAT 4440
CCaCCXCTCT GGACAGTAAT GGTGCAGCAT TGCC1X5ATOQ AAATAITAGCT GAOAGCTTAG 4500
AQTCTTTAGT TTAACACAOA AAGGGGTGGG (5C3GACTCACA TCTGAGCATT cmTCCTcr 4560
TCCTAAAATT AGGCAGGAAA .......... TICTGTTATC TGTTGATTTC CC3VTCACCTG 4620
ACASTAACTT TCATOACAXA GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 4680
TTTTXGCAAG ACTT8TAATT TACTTATTAT 6TTTGAACTA AAATGATT6A ATTTTACAGT 4740
ATTTCTA2iGA ATGGAATTGT GGTATTTTTT TCT6TATTGA TTTTAACAGA AAATTTCAAT 4800
TTATAGAGQT TAGGAATTCC AAACTACAQA AAATGTTTQT TTTTAGTQTC AAATTTTTAG 4860
CTGTATTTGT AGCAATTATC AGGTTTGCXA GAAATATAAC TTTTAATACA GTAQCCTGXA 4920
AATAAAACAC TCTTTCCATAT GATATTCAAC ATTTTACAAC TGCACnATTC ACCTAAAGTA 4980
GAAATAATCI GTTACTTATT GTAAATACT6 CXrCTAGTGlC TCCA-FGGACC AAATTTATAT 5040
TTATAATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTTACSTTC TGTGTAATT6 5100
TTTAGTTTAA TGA£X5TAQTr .......... TCTTACTCTA CCAGTTTTCX GACATTGTAT 5160
TCTQTTACCT AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACITTTQ T06AAAATAG 5220
AAATAGCTTC ATTTTGAAAG AAGTXTTrAT GAdGAATAACA CCTTACCAAA CftTTGTTCAA 5280
ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCC^TTAA AAAAAAAAAA 5340
AAAAAAAAAA AAAAAA7UUU\ AAA 5363

Seq ID NO: C100 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148..4352



1 11 21 31 41 51 

| | | | | |

CACACATACG CAGGCAGGAT CTCACTTOGA TCTATACACT GGAGGATTAA AACAAACAAA 60
CAAAAAAAAC ATTTCCTTCG CTCCCCCTC3C OTCTCC3lCrC TGAGAAGCAG AGGAGCCGCA 120
OQGGGAGGGG CCGCAGACCG TCTGGAAATG OQAATtOTAA AACOTTTCCT OSCTTGCATT 180
CA(3CTC!CTCr GTGTTTGCCG CCTGGATTQG GCTAATGGAT ACTACAGACA ACASAGAAAA 240
CTTGTTGAAG AGATTGGCTG GTCCTATACA QGAGCACTOA ATCAAAAAAA TTGOGGAAAG 300
AAATATOCAA CATGTAATAG OCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGSTT GOGATAAAAC ATCATTGGAA 420
AACACATTCA TTCATAACAC TOSSAAAACA GTGSAAAT7*A ATCTCACTAA TGACTACC6T 480
GTCAGCOGAQ QAOTTTCJW3A AATGQTGTTT AAAOC&AQCft. AOATAACTTT TCACTGGGGA 540
AAATGCAATA TGTCATCTQA TGGATCAGT^ CATAGTTTAG AAGGACAAAA ATXTCCACXT 600
GAGATGCAAA TCTACIGCTT TQATGOBOAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660
OGAAAAGGGA AGTTAAGAGC TITATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTXG 720
GATTTCAAAG CGATTATTGA TOGAGTCQAA AGTOTTAGTC (Ji - XVi OgGAA GCAGGCTtaCT 780
TTAGATCCAT TCATACTGTT 6AACCXTCTG OCAAACTCAA CTGACAA6TA TTACATTTAC 840
AATOGCTCAT TGACATCTCC TCCCTGCACA. GACAGAGTTG ACTGGATTGT TTTTAAAOAT 900
ACHBTTAGCA TCTCTGAAAG CGAGTTGGCT GTTTTTTTGT GAAGTTCTTA CAATGCAACA 960
ATCTGGTTAT GTCSVTGCTOA TGQACTACTT ACftAAACSU^T TTTGQAGAGC AAOIBTACAA 1020
GTTCrCTAGA CAGGT6TTTT CCTCATACAC TCGAAAGOAA 6AGATTCRTG AAGCAGTTIG 1080
TAGTICAGSAA CCAOAAAATG TTCAGQCTGA CCXZAGAGAAT TATTiCCAGCC TTCTTOTTAC 1140
ATGGGAAAGA CCICGAGTC6 Tn:»iaATAC QVTOATTGAG AA6TITGCAG TTTTGTACCA 1200
GCAGTXGGAT GGAGAOGACC AAAOCAAGCA TQAATTTTTG ACAGATGOGT ATCAA6RCTT 1260
QGGTGCTATr CTCAATAATT TGCTAOCCAA TATGAGTTAT GTTCTTCAGA TAGTAGOCAT 1320
AX6C:A<:rEAAT GGCrCATATO GAAAATACAQ GOACJCAACIG ATTGTmACA TCJCXIXACTGA 1380
TAATCCXGAA CTTGATCTTT TCCCTGAATT AATTGGAACT GAAGAAATAA TCRAGGAGGA 1440
GGAAGAGGGA AAAGACATIG AAOAAGOGQC TATTGTGAAT OCTGSTAGAG AC&OIGCTAC 1500
AAAC3C3AATC AGGAAAAAGG AACCCCAGAT TTdACCACA AiCACACTACA AXGGCATAG6 1560
GA06AAATAC AAIGAAGC3CA AOACTAACCG ATOCCCAACA AGAGC3A?U3TG AATrCTCTGQ 1620
AAAGGfflGAT GTTCCCAATA CATCTTTAAA TTOCACTTCC CAACX3VGTCA CTAAATTAGC 1680
CACAGAAAAA GATATTTCCT TGACTTCTCA GAGTGIGACT GAACTGCCAC CTCACACIXST 1740
GGAAGGTACr TCAGCCTCTT TAAATGATGG CTCTAAAACT OTTCTTAOA^r CTCCACATAT 1800
GAACTTGTOS GOGACTGOkG AATCXTTTAAA TAGAGTTTCT ftTAACAGAAT ATGAGGAGGA 1860
GAGXTTATTG ACCAGTTTCA AGCTTGATAC TGGAGCIGAA GATTCTTCAG GCTOOVSTCC 1920
C3BCAACTTCT GCTATCCCAT TCATCTCTGA GAACATATCC CAAGGGTATA TATTMCCXC 1980
OGAAAA0CX3V GAGACAATAA CSaATGAIGT CCTTATACCA GAATCTGCTA GAAATGCTTC 2040
CGAAQATTCA ACTTCATCAG GTTCAGAAGA ATCACTAAAG GATOCTTCTA TGGAG6GAAA 2100
TOTGTGGTTT CCZAGCCCXA CAGACATAAC AGCACAGCGC GATGTIGGAT CAGGCAGAGA 2160
GAGCTETCTG CAGACTAATr ACACTGAGAT ACaTOTTOAT GAATCTQAOA AGACAACCAA 2220
GTOCTTTTCT GCAGGOOCAG OJGAIGTCACA GGGT0C3CrCA GTTACaSATC TCGAAAnOCC 2280
ACATTATTCT ACCTTTGCCT ACrTCCC3U\C TGAGGTAAC3L CCTCATGCTT TTACCCCMlTC 2340
CTC2CAGACAA CAGGATTTGG TCTCCAGGGT CAAOGTGGTA TACTOGCAGA CAACCCAAGC 2400
GGTATAGAAI- GAGGCCAGXA ATAGTAGOCA X6AGTCTGGX ATXGGrClAG CIGAGGGQZT 2460
GGAATCO0AO AAGAAGGCAG TTRTAOOCCT TGTGATOGTG TCAGCOCTGA CITTTATCTG 2520
TCTAGTG6TT CTTGTGGGTA TTCTCATCTA CTGGftGGAAA TGCTTCCAGA CTQCACACTT 2580
TTACTTAGAG GACAGTACAT CCCCTAGAGT TATATCCACA CCTCCAACAC CTATCTTTOC 2640
AATXTCAGKT QATGTOGGAa CAATTCCTtAT AAAGCACTTT CCAAAGCATQ TTOCT^GATXT 2700
ACATGCAAGI AGTGGGTTTA CtGAAGAATT IGAGACACTG AAAGAGTTTT ACCAGGAAGX 2760
GCAGAGCIQT ACTGTTGACT TAGQTATTAC ASCAOACSaC TCOWQCACX: OIGACAACAA 2820
6CACAAGAAT CGATACATAA AXATaGTTGC CTATGATCAT AGCAGGGTTA AGCTAGCACA 2880
GCITGCTGAA AGGATOGCAA ACTGACTOAT TATATCAATG CCAATTATGT TBAnGGCTAC 2940
AACAGACCAA AAGCITAXAT TGCXGCCGAA GGCCCACTGA AATOCACAGC TGAAGArTTC 3000
TGGAGAATGA TATGGGAACA TAATQTGGAA GTTATXGXCA TSATAACAAA CCrGOTGGAG 3060
AAAGGAAGGA GAAAATGTGA TCAGTACXGG GCTGOCGATS GGaGTGAGGA GTACGOGAAC 3120
TTTCTOGXCA CTCAGA2U3AG TGTGCSIASTG CrTTGCCTAXX ATACTGTGAG GAATTTTACTT 3180
CTAAGAAACA C&AAAATAAA AAAGGGCXCC CAGAAAGSAA OACCCAOrGG ACGTGTGGTC 3240
ACACAGTAXC ACTACAOGCA GTGGCCTGAC ATGGGAGXAC CAGAGTACTC CCXGCCAGTa 3300
CXGACCITTG TGAGAAAGGC AGCCTATGQC AAGCGCCA7G CAGSOGGGCC TGTIGTCGXG 3360
CACTGC3U3TG CTGGAGTTG6 AAGAACAGGC ACATATAXTG TGCTAGACAG TAIGTTGCAG 3420
CAGATTCAAC AGGAAGGAAC XGXCAACATA TXTGGCTXCT TAAAACACAT OOGTTCACAA 3480
AGAAATTATX TGGTACAAAC TGAGGAGCAA TATGTCTTCA XTCATGATAC ACTGGTTGAG 3540
GCCATACITA GTAAAGAAAC TGRGGTGCTG GACACTCKTA TTCAXGCCTA TGTTA&TQCA. 3600

CrCClCHTTC CTGGACCAOC AAQCTJlGftiGA AACftATTCCA OGOTCTCACr 3660 

CTGTCAOCCA HGCTGGiUSTG CAGAGQCACA ATCTCGGCTC ACTGCftACCT TCCTCTC?CCT 3720

GGCTTAACTG ATCCTCXZTAC CTCAGCCTCC CX3AGTGGCTG GGACTATACT CCTQAiOCCAG 3780

TCAAATATAC AGCAiQAGTGA CTATTCTGCA GCCCIAAAGC AATGCAACAG GGAAAAGAAT 3840

CGAACITCTT CTATC%TCCC T6TOQAAAGA TCAAGGGTTG GCATTTCATC CCTGACTGGA 3900 

6AAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGQCT AXTACCAGAG CAATGAATTC 3960 

ATCRTTACCC AGCACCCTCT CCTTCaTACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCC AACTGGTQGT TATGATTCCT GATGGCCAAA ACATGaCAOA AOATGAATTT 4080

GfTTTACTGGC CAAATAAAGA TGAGCCTATA AATTSTGAGA GCTTTAAGGT CACXCTTATG 4140

GCTGAA5AAC ACAAATGTCI ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACITQAA GTBRGGCACT OrTCAGTGTCC TAAATGGCXIA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GftACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAGGG ATGGGCCTAT OATTQTTCAT QATOAtSCATG GAGGAfiTGAC GGCRGGAACT 4380 

TTCTGT6CTC TGACAACCCT TATGCAOCAA CTAGRAAAAG AAAATTCC3GT GGATGTTTAC 4440

CAGGTAQCCA AQATGATCAA TCTGATGAGG CCAGGABTCT TTGCTGACAT TGAGC3UGTAT 4500

CAGTTTCrCT ACAAAGXGAT CCTCAGCCTT GTCGGCACAA GGCAQGAAGA GAATCXATCC 4560 

ACCTCTCTGO ACAGTAATQG TGCAGCATT6 CXITGATGGAA ATATAGCTGA GAGCTTi^G 4620 

TCTTTASTTT AACACAGAAA GQGGTGGGGG GACTCACATC TGAQCATTGT TTTCCTCTTC 4680

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTXCXX: ATCACCIGAC 4740

AGTAACrrXC ATGACATAGG ATTCTGOCGC CAAATTTATA TCATTAACAA TQTGTGCCTT 4800 

TTTGGAAOAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATR3AAT TTTACAGTAT 4860

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAQATU^ ATTTC3VATTT 4920

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTCTCAA ATTTTTAGCT 4980

GTATTTGTAG CAATTATC^ GTrTCCTAGA AATATAACTT TTAATACAGT AGCCI6IAAA 5040

lA2lAACiU^C rrCCATATGA XATTCAACAT TITACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TACTTATTGT AAATACTGGC CTAGTGTCTC CA'TOGACCAA AirTCATATTT 5160 

ATAATTGXiVS ATTTTTATAT TTTACTACTG AGTCAAOTTT TCTAQTTCra TQTAATTGTT 5220 

TAGTTTAATQ ACGTAQTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280 

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATITT AACTTTTGTG GAAAATAQAA 5340

AXACCTTCAT TTTGAAAGAA GrmrXtOGA GAAXAACACC TTACXIAAACA TTGTTCAAAT 5400 

GQTTTTTATC CAAGGAATIG CAAAAATAAA TATAAATATI GGCATXAAAA AAAAAAAAAA 5460 

AAAAAAAAAA TUUVAAT^AAAA A 5481

Seq ID NO: C101 DNA Sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..3340

1 11 21 31 41 51

| | | | | |

ATGGGAATCC XAAAGGGTTT CClXaSCTTGC ATTCAiGCTOC TCTGTGTTTG CGGCCTGGAT 60 

TQQGCTAATG GATACTACAG ACAACAGAOA A3UVCTTGTTG AA£3A6ATTGG CTCGGTCCIAT 120 

ACAGGAGCAC TGAATCAAAA AAATTGGGGA AAGAAATATC CAACATGTAA TAGOCCAAAA 180

CAATCTCCTA TCAATATTGA TGAAGATCTT ACRCRAGTAA ATGTGRATCT TAAGAAACTTT 240 

AAATITCAdGO GTTGOGATAA AACATCATTG GAAAACACAT TCKTTCATAA CACTGOGAAA 300

ACAGTGGAAA TTAMCICAC TAATGACTAC GGTGTCAGOS GAGGAGTTTC ASAAATGQTG 360 

TTTAAAGCAA GCAAGATAAC TTTTCACIGG GGAAAATGCA ATA7GTCATC TGATGGATCA 420 

GAGCATAGTT TAGAAGGACA AAAATTTCCA CTTOAQATGC AAATCTACTG CTTIGATGCG 480 

GACOGArrm CAAGTTTTGA GGAAGCAGTC AAAOGAAAAG GGAAGTTAAG AGCTTTATCC 540 
ATTITGTTTG AGGTTGGGAC AGAAiSAAAAT TTGGATTTCA AAGCGATTAT TGATGGAGTC 600
GAAAGTGTTA GTGGrmGG GUUSCAGGCT OCTTTAGATC CATTOlTACr 0TTQAACCTT 660 
CXGCSCAAACr CAACTQACAA GTAITACATT llACAA^TGGCT CATTGACATC TOdCCCTGC 720 
ACAQACACAG TTGACIGGAT TGTTTTTAAA GATACAGTTA GCATCTCTOA AAGCCAGTTG 780
GCTGirrTTT GTOAAQTTCT TAGRATGCAA GARTCTGGTT ATGTCATGCT GATGGRCTAC 840
TTACAAAACA ATTTTCGAGA GCAACAOTAC AAOTTCICTA GAX2AGGTGXX TTCCICATAC 900
ACrrUtaAAAGG AAGAGATTCA ^CGAAGCAGTT TG3AGTTCAG AAOCAGAAAA TGTTCAGGCT 960 
6AOCCAGAGA ATTATACCAG CCTTCTTGTT ACATGGGAAA GACCTGOAST OOTTTATGAT 1020 
ACCArrGAXTG AGAAGTTT6C! AGTTXTGTAC CAGCAGTTGG ATGGAGAGGA CCAAACCAAG 1080 
CATQAATTTT TOACAGATGG CTATCAAGAC TTGGGTGCTA TTCTCAATAA TITGCTACCG 1140
AATAIGAGTT ATGTTCirCA GATAGTAGCC ATATGCACTA ATGGCITATA TGQAAAATAC 1200
AGGGnCCAAC TGASTOTOaA CATGCXSTACT GmAAITCCTG AGGGCTkCTAA TAGTA6GCAT 1260 
BAGTCTTCGTA TT6GTCTAGC TGAOGGGTTG GAATCa3Al3A A^UUGGO^GT TATACCSGCTT 1320 
GTGftTCGXGX GAGOCClGAC TTTXATCTGT CTA6TGGTTC TTGTGGGTAT TCTCATCTAC 1380 
TOQAQGAAAT QCTTCCAGAC TDCACACTrT TRCTTAGAGG ACAGTACATC CCCTAGASTT 1440 
ATATCCACAC CTCCAACACC TATCITTGCA A7TTCAGATG ATQTCGGAGC AATTCCAACA 1500
A2U3CACITTC CAAAG^TQT TGCaUSAXTTA CATOCAAGTA GIOOGTITAC TGAAGAATXX 1560 
GAGACACTGA AAGAGTTTTA CCAOGftAGTQ CAGAGCTOTA CTaTTOACIT JUSGTATTACTl 1620 
GCAGACAGCT GCAACCACZCC AGACAACAAG CACAAGAATC GATACATAAA TATOSTTGCC 1680 
TATGATCATA GCAGGGTTAA GCTAGCACAQ CTTGCTQAAA AGGATGGCAA ACTGACTGAT 1740
TATATCAATG CCAAXTATGT TGATGGCTAC AACAGACCAA AAGCTTATAT 7GC£GCX:CAA 1800
OGCGCACTGA AATCX!ACAGC TGAAGATTTC TGGABAATGA TATGG6AACA TAA16TGGAA 1860
GTTATIGTCA TGATAACAAA CCTGGIGGAG AAAGGAAGGA GAAAAIGTGA TCAGTACIGG 1920 
GCTGCCGATG GGAGTGAGGA GTACGGGAAC TTTCTGGTCA CTCAGAA6AG TGT6CAA6TG 1980 
CTTGCCTATT ATACTOTOAG GAATTTTACT CTAAGATiACA CAAAAATAAA AAAGQGCTCC 2040
CAGAAAGGAA GACXXA6TOG AGGTGTGGTC ACACAGTATC ACTACACGCA GT6GCCTGAC 2100
ATOGGAGTAC CAGAGTACXC CX:TGCCAGTG CTGACCXXXG TGAGAAAGGC AGCCIAX6CC 2160
AAGGGCCATG CAGTGGGGCX: TGTTGTOGTC CACTGCAGTQ CTGGAGTTGQ AAGAAGAOGC 2220
ACATATATTG TGCTA6ACAG TATOTTGCAG CAGATTCAAC AOGAAGSAAC TGTCAACATA 2280
TTTQGCTTCT TAAAACACAT OOGTTCACAA AGAAATTATT TGOTACAAAC TORQOAGCSrt. 2340
TATGTCrrCA TTCATGATAC ACIGGTTGAQ GGCATACTTA GTAAAGAAAC TGAGGTGCI6 2400
GACAGXCATA TTCATGCCTA 70TTAATGCA CTGCTCAXXC CTGQACCAGC AGGQU\AAC3l 2460 
AAGCXAGAGA AACAATTCCA GCTOCTGAGC CAOTCAAATA TACAGCAGA6 TS^CTATTCI 2520 
GCAGCCCTAA AQCAATSC3Uk CASGOAAAAG AATOGAACTX CTTCXATCAT CCCTGTGGAA 2580 
AGATCAAGGG TTGGCATTTC ATCCCTGAGX GSRGAAGGCA CSUBkCTACAX CAATQCCTCX: 2640 
rATATCATGG GCTATTACCA GAGCAATGAA TTCATCATTA CTCAGCACXX: TCTCCTTCAT 2700

ACCATC2UU3Q ATTTCrrGOAG GATCSATATGG GACCATAATO CCCAACTGGT GOTTATGATT 2760 

CCTGATGGCC AAAACATGGC AGAAOATXSftA TTTGTTTACT GGCCAAATAA AGATGAGCCT 2820

ATAAATTGTQ AGAQCTTTAA GGTCaCTCTT ATGGCTOAAa AACACAAATG TCTATCTAAT 2880

G1U3GAAAAAC TTATAATTCA GGACTTTATC TTAGA2MQCTA CACAGGATGA TTATCTACTT 2940

GAnSTOAOGC ACTTTCMirG TCCTAAATGG CCAAATCCAG ATAQCCCCAT TAGTAAAACT 3000

TTTGAACTTA TAAGTGTTAT AAARGAAGAA QCTGCCAATA GOGATGGGCC TATGATTOTT 3060 

CATGATGAGC ATGGAGOAGT QACGGCAG(3A ACTTTCTGTG CTCTGACAAC CXTTTATGCAC 3120 

CAACXAGAAA AAfiAAAATTC CGTC3GATGTT TACCAOOTAO CCAAGATGAT CAATCTGATQ 3180 

T^GOCAGGAG TCTTTGCTQA CATTGAGCAQ TATCAGTTTC TCTACAAAQT GATCCTCAGC 3240

CTTGTGAGCA dUV^CAGGA AGAOAATCCA OCTAOCrCTC TGGACAGIAA TQGTGCftGCA 3300 

TTGCCTGATG GAAATATAGC TGAGM3CTTA GAGTCTTTAG 3340 

Seq ID NO: C102 DNA Sequence

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..4480

1 11 21 31 41 51

| | | | | |

ATGOGAATCC TAAAGCGTTT CXTTCGCTTGC ATTCAGCTCC TCTGTGTTTG CCGCCTGGAT 60

TGGGCTAATO GATACTACAG AChAGAQAGA AAACI7GTTO AAGAGAT1X3G CTGGTCCTAT 120 

ACAGGAGCAC TGAATCAAAA AAATTGGGGA AASAAATATC CAACATGTAA TAGCCCAAAA 180 

CAATCTCCTA TCAATATTGA TOAAQATCTT ACACAAGTAA ATGltSAATCT TAABAAACTT 240 

AAATTTCAiGO GTTGGGATAA AACATCATTQ GAAAACACAT TCATTCATAA CACTGGGAAA 300 

ACAGTG6AAA TTAATCTCAC TAATGACTAC CGTGTCAGCG GAGGAGTTTC AGAAATGGTG 360

TTTAAAGCAA GC3UVGATAAC TTTTCACTGe GGAAAATGCA ATATGTCATC TGATGGATCA 420 

GAGCATAGTT TAOAAGOACA AAAATTTOCA CTTGAGATGC AAATCTACTG CXrXGATGCA 480 

GACOGATTTT CAAGTTTTGA QOAAGCAGTC AAAGGAAAAG GGAAGTTAAG AGCTTTATCX: 540 

ATTTTGTTTQ AGGTTOGGAC AGAAGAAAAT TTGGATTTCA AAGOSATTAT TGATGGAGTC 600

GAAAGTG^^TA GTCGTTTTGG QAAQCAfiGCT GCTTTAGATC CATTCATACT GTTQAACCTT 660

CrrODCAAACT CAACIGACAA GTATTACATT TACAATGGTT (ZATTGftCATC TCXITCOCTGC 720 

ACAGACACAG TTGACrOGAT TGTTTTTAAA GATACRGPITA GCATCTCTSA AAGCCAGTTG 780 

GCTGTTTTTT 6TGAAGTTCT TACAATGCAA CAMCTGGTT ATCHTCATQCT GATGGACTAC 840

TTACAAAACA ATTTTCGAGA GCRACAGTAC AAGTTCTCTA GACAGGTGTT TTCCTCATAC 900 

ACTGOAAAGG AAGAGATTCA TGAAGCAQTT TGTAGTTCAG AACXMAAAA TGXTCAGGCT 960

GACGCAGAGA ATTATACCAG CXSTCllGXT ACAT3QGAAA GACCTOGAGT COTTTATQAT 1020 

ACOUTIGATTB AGAAGTrTGC AGTTTTQTAC CAGCAGTTGG ATGOAGAGGA CCAAACCAAG 1080 

QiTGAATTTT TGACAGATGG CTATCAAGAC TrQGGTGCTA TTCTCAATAA TTTQCXACCC 1140 

AATATQAGTT ATGTTCTTCA GATAGTAGCC ATATGCACTA ATQGCTTATA TGGAAAATAC 1200

AGCGAGCAAC TGATIGT03A CAlGCCTACT GATAATCCXG AACTTGATCT TTTCCCIGAA 1260

TTAAZTGOAA CXGAAGAAAT AATCAAGGAG GAGGAAGAGG GAAAAGACAT TGAAGAAGGC 1320 

GCTATTGTGA ATCCTGGTAG AGACAGTGCT ACAAACCRAA TCAGGAAAAA GGAACCCCAG 1380 

ATTTCTACCA CAACACACTA CAATCGCATA GGGAOGAAAT ACAATGAAGC CAAGACTAAC 1440 

0GATC30CCAA CAAiGAGQAAG TGAATTCTCT GGAAAQGGTG ATGTTCCCAA TACATCTTTA 1500 

aattcxj«:tt coiaaccagi cactaaatta gccacagaaa aagatatttc cttgacttct 1560

C?UGACTGTGA CTGAACTGCC ACCTCACACT GTGGAAJ3GTA CTTCAGCCTC TTTAAATGAT 1620 

GGCTCTAAAA crGTTCTTAG ATCTOCACAT ATGAACTTGT CGGGGACTGC AGAATCCTTA 1680

AATACRfiTTT CTATAACAGA ATAT6AQGAG GAGAGTTTAT TGACCAGTTT CAAQC3TGAT 1740 

ACTGGAGCTG AAGATTCXTC AGGCTCC3«3r CCGGCRACTT CTGCTATCOC ATTCATCTCT 1800

GAGRACATAT CCCAAGGQTA TATATTTTCC TCCOAAAACC CAGAGACAAT AACATATGAT 1860

GTGCTIATAC CAGAATCTGC TAGAAATOCT TOOSAAGATT CAACTTCATC AOGTTCAGAA 1920 

GAATCACTAA AGGATOCTTC TATOGAiGGGA AATGTGTGGT TTOCTAGCTC TACAGACATA 1980 

ACSGCACAGC CGGATGTTGG ATCAOGC!AGA GAGAGCTTTC TCCAGACTAA TTACACTGAQ 2040 

ATAOGTGTTG AIGAAXCTGA GAAGACAACX: AAGTCCTTTT CTGCAflGOCC AGTGA'XGTCA 2100 

CAGGSTCCCT CAGTTACAGA TCTGGAAATG CCACATTATT CTACCTTTGC CTACITCXICA 2160

ACTGA&GTAA CAjCCICATGC TTTTATCXX3^ TCCICCAGAC AACAGGATTT GGTCTCCAOG 2220 

GTCAACGTG6 TATACTOSCA GACAAOCCAA O0GGTA7ACA ATGAGGCX^AG TAATAGTAGC 2280 

CATQAOTCTC GTATTOGTCT AGCTGAGOGG T3?GGAATCC3G AOAAGAAGGC AGTTMACOC 2340 

CTTGTGATCG TGTCAGCCCT GACmrTTATC TOTCEAGTSG TTCTTGTG G Q TATTCTCKTC 2400 

TACTGGAGGA AATGCTTCCA GAiCTGCawCSUC TTTTACTTAG AGGACAGTAC ATCOOCTAQA 2460

GTTATATCCA CACCTCCAAC AOCTATCTTT CCAATTTCAG ATGATQTCSGG AGCAATIOCA 2520 

ATAAAGCACT TTOCAAAGCT. TOTTGCftGAT TTACATGCAA GTAGTQSGTT TACTGAAGAA 2580

TTTOAGUUIAC TGAAAGAGTT TTACCSGGAA GTGCAGAGCT aTACTOTTGA CXTAGGTAIT 2640 

ACAGCAGACA 6CTCCAACCA CCCSUSACAAC AAGCACAAGA ATOGATACAT AAATATCSGTT 2700 

GQCCA3GATC ATAGC3U3GGX TAAGCTAGCA GAGCTTGlTrG AAAAGGAIGG CAAACTGACT 2760

GATTA^TATCA ATGCX^iATTA TGTTGATGGC TACAACAGAC CAAAAGCTTA TATTGCTGOC 2820 

CAAGGOCCAC TGAAATCCAC AGCT6AAGAT TTCTGGAGRA TGATAOXSGGA ACATAATGTQ 2880 
GAAGTTATTG TCATOATAAC AAACCTOGTG GAGAAAGGAA GGAGAAAATG TGArCAGTAC 2940 
TGGCCTGCCG AIGGGAGTGA GGAGTACGGG AACTTICTGQ TGACTCAGAA GAGIXSTGCAA 3000 
GIGCTTGCCT ATTATACTQT GAGGAATTTT ACTCEAAGAA ACAGAAAAAT AAAAAAGGGC 3060
TCCXaCAAAG QAAGi^CAG TQGRCGTGTG GTCACACAGT ATCACTACAC GCMTOGCCT 3120 
GACATGG6AG TACCAQAQTA CTCCCTGCX3V GTGCTGAOCT TTGTGAGAAA GGCAGCCTAT 3180 
GCCAAGCGCX: ATGCAGTGG3 GCClGTTOTC GXOCACTGCA GTGCIG6AGX TGGAAGAACA 3240 
GGCAGATATA TTGTGCTA6A CAjGTATGTTG CAGCaMSATTC AACAOGAAGO AACTOTOU^ 3300
AXATTTGGCT TCTTAAAACA CATCC3GTTCA CAAAGAAATT ATTTGGTACA AACIGAGGAG 3360
CAATATGTCT TCATTCATGA TACACTGGTT GAGaCCA*EAjC TTAGTAAAGA AACTGAGGlG 3420
CTGGACAGTC ATATTCATGC CTATG7XAAT GCACTCCTCA TTCCIGGACC AGCAGGCAAA 3480
ACAAAGCTAO AGAAACAATT CCAGGOTCTC ACTCIGTCAC OCAGGCTGOA GTGCAGAGGC 3540
ACAATCTOQG CTCACTGCAA CXZITGCICVC GCIGGCTTAA CTGKrCCTCC TAOCTCAGCXZ 3600
TQCOGAGTGG CTAGQACTAT ACTCCTGAGC CAGTCAAATA TACAGCAGAG TGACXATTCT 3660
GCTUSCCCTAA AGCAATGCAA CAGGOAAAAG AATGGAACTT CTTCTATCAT CCCTGTGGAA 3720 
AGATCAAGOS TTGGC!ATTTC ATCOCTGAGT GGAGAAGGCA CAOACTACAT CAATGCXTTCC 3780 
XAXATCATGG GCTATTACCA GAGCAATGAA TTCATCATTA OCCAGCACOC TCTCCTTCAT 3840 
ACCATCAAGQ ATTTCTGGA6 GATOATATGG GACCATAATG OCCAACTGGT GGTTATGATT 3900 
OCTGATGGCC AAAACATGGC AGAAQATGAA TTT6TTTACT GOCCAAATAA AGAT6AGCCT 3960
ATAAAnTGTG A6AGCTTTAA C3GTCACTCTT ATGQCTGAAS AACACAAATG TCTRTCTAAT 4020
GAGGIAAAAAC TTATAATTC& GGACTTTATC TTAG?UiGCTA CACAGGATOA TXATGTACTT 4080
GAAGTGAGGC ACTTTCAQTG TCCTAAATGG CCAJUiTCCAG ATAGCCCCAT TAQTAAAACr 4140
TTTGAACTTA TAAfiTGTTAT AAAAGAAGAA □CTGCCAATA GGGAToaacc TATGATTOTT 4200
C?VTGA.T6A6C ATGGAGOAGT GAOGGCAGGA ACTrrCTGTG CTCTGACAAC CCTTATGCAC 4260
CIVACTAGAAA AAGAAAATTC OGTGGATGTT TACCAGGTAG OCAABATGAT C3\ATCrGATG 4320
AGG0CA60AG TCTTTGCTQA CATTGAGCAG TATCAGTTTC TCTACAAAGX 6ATCCTCAGC 4380
CTTGTGAGCA CAAG6CAGGA AGABAATCCA TCCACXTCTC TGGACAGTAA TGGTGCAGCA 4440
TTGCCTGAT6 GAAATATAGC TGAGAGCTTA GAGTCTTTAG 4480



Seq ID NO: C103 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 2..4220

1 11 21 31 41 51

| | | | | |

ATGGGAATCC TAAAGCQTTT CCTOGCTTGC ATTCAGCTCC TCTGTGTTTG CCXSCCrGQAT 60 

TGiaaCTAATG GATACTACAG ACAACAGAGA AAACTTGXTG AAC3AGATTQG CTQGTCCTAT 120 

ACAGGAfiCAC TGAA7CAAAA AAATTGGGOA AAGAAATATC CATVCAIGTAA TAGCCCAAAA 180 

CAATCTCCTA TCAATATTGA TOAaGAlCTT ACACAAGTAA AT6TGAATCT TAAGAAACTT 240

AAATTTCAGG CTTOGOATAA AACATCATTG GAAAACACAT TCATTCftTAA CACTGGORAA 300 

ACAGTGOAAA TTAATCTCAC TAATGACTAC CSGTGTCAGOG GAGGAQTTTC AGAAATGGTG 360 

TTTAAAGCAA GCAAQATAAC TTTTCACTGG GGAAAATGCA AXATGTCATC TQATGGATCA 420 

QAJHCATAGTT TAGAAGGACA AAAATTTCCA CTTGAGATGC AAATCTACTG CTTTGATGCG 480 

6ACX:GA'XTTT CAAQTTTTGA GGAASCAGTC AAAGGAAAAG GG3\A8TIAAa AGCITTATCC 540 

ATTTXC5TTTS AGGTTaaOAiC 2U5AAGAAAAT TTGGATTTCA AAGCGATtAT TGATGGABTC 600

GAAAGTGTTA GTCGTTTTGG QAAGCAGGCT GCTTTAGATC CATTCATACT GTTGAACCTT 660 

CIGCCAAACT CAACTQACAA GTATTACATT TACAATOGCT CATTGACATC TGCTCCCTGC 720 

ACAOACACAG VrGACTGOAT TGTTTTTAAA GATACAGITA GCATCTCTGA AAGOCAGTTG 780

GCTGTTTTTT OXOAAlGXTCT TACAAXGCAA CRATCTGGTT A3GTCATGCT 6ATGGACTAC 840 

TTACAAAACA ATTTTCaAGA GCAACAGTAC AAGTTCTCTA OACAGSTGrT TTCCTCATAC 900 

ACTOGAAAGG AAGAOATTCA TGAA0CAGTT TGTAGTTCAG AAOCAGAAAA TQTTCAGGCT 960 

GACCCAGAGA ATTATACCAQ CCTTCTTGTT ACATGGGAAA GAOCTCGAGT OGTTTATGAT 1020 

ACCATGATTG AGAAGITTGC AGTTTTQTAC CaGCAGTTGG ATOGAGAGGA CCAAAGCAAG 1080 

CATGAATTTT TGACAGATGG CXKTCAAGAC TTGGGIGCTA TTCTCftATAA XTTGCTACCC 1140 

AATATGAGTT ATGTTCTTCA GATAOTAGCC ATATGCACTA ATGGCXTATA TOQAAAATAC 1200 

AGCQACCAAC 1K5ATTGTCBA CAraZCTACT t3ATAATCX:TG AACTTSATCr TTTCCXTTGAA 1260

TTAATTGGAA CTOAAGAAAT AATCAAGGAG GAGGAAGAQG GAAAAGAC3VT TOAAEAAGGC 1320 

GCTATTGTGA ATCCTGGTAG AGACAGTCCT AC3kAACCAAA TCAGGAAAAA GGAACCOCAG 1380 

ATTTCTACCA CAACACAGTA CAATQQCATA GGGAGQAAAT ACAATGAAGC CAAGACIAAC 1440 

QSATCCCCAA CAAGAGGAAS TGAAITCTCT GGIAAAGGGTG ATGTTCCCAA TACATCTTrA 1500 

AATTOCACTT OCCAACX^AGT CACTAAATTA GCCACAGft2\A A2UjATATTTC CTTGACTTCT 1560

CAGACIGTGA CTGAACTGCC ACCTCACACT GTGGftAGGTA CTTC3WQCCTC TTTAAATGAT 1620 

GGCtCTAAAA CTOTTCTTR6 ATCTCCACAT ATGAACTTGT C3GGGGACTGC AGAATCCTTA 1680

AATACASITT CTATAAGAiGA ATATOAGGAG QAGZ^IXAT TGACCAOTTT CAAGCTIGAT 1740 

ACTQGAGCrO AftaA' il 'C L ' T C TbGGCTCCAGT CX:OGCAACTT CiGCTATOCC ATTCATCTCT 1800

GAQAACATAX OOCAAOGGTA TATATTTTCC TCCSSMMCC CAGAGACAAT AACATATGAT 1860

GTCCTTATAC CJWSiATCrGC TAGAAATGCT TCCGAA£3ATT CAACTTCATC AOGTTCA6AA 1920 

GAATCACIAA AGGATCCTTC TATGGAG6GA AArGTGTGGT TTCCTAGCTC TACAGACATA 1980 

ACAGCACAGC OCGAX6TTQG ATCAQGCAGA GAGAQCTTTC TCCAGACTAA TIACACXGAG 2040 

ATA0GTGTT6 ATGM.TCTGA GAAGACAACC AAGTCCTTTr CTGCROSGCC AGTGATEQTCA 2100 

CAGGGTCCCr CAGTIACAGA TCIGGATU^TG CC3U31TTATT CTACCTTTGC CTACTTGGCA 2160 

ACT6AGGTAA CACCTCATGC TTTTACCXX^L TCXTTCCAGAC AACAGGATTT GQTCTCCACG 2220 

GTCAAOSTGG TATACTOGCA GACAACCCAA C3GGGTATACA ATGAGGCCAG TAATAGTAOC 2280
CATGA(3TCTC GTATXGSrCT aGCXaAGGQa TTGGAATCXDG AGAAOAAGGC AGTTATACCC 2340 
CTTGTGATG6 TGTCAGCCCT GSACTTTTATC TQTCHiG^CGG TTCTTGTGBG TATTCTCATC 2400 
TACTGGAGGA AATGCTTCCA GACTGCACAC TTTTACTTAQ AGGACAGTAC ATCCOCIAGA 2460 
GTTATATOCA CACCTCCAAC AOCTATCTTT CCAATTTCAG ATOATQTOGG AGCAATTCCA 2520
ATAAAGCACr TTCCAAAGCA TGTTGCftGAT TTAjCATGCAA 6TAGTGGQTT TACTGftAtaA 2580 
TTTGAGACAC TOAAAGMSrS TTACC3U»aAA GIGCAGAGCT STACXGTTGA CTTAGGTATT 2640 
ACAGCAGACA GCTCCMCCA OSCtUSACAAC AAGCACAAGA ATCGATACAT AAATATOGTT 2700 
GCCTATQATC ATAGCAGG6T TAAflCEAGCA CAGCTTOCIG AAAAGGATGQ CRAACTGACT 2760 
GATTATAXCA ATGOTiATTA TGTTGATGGC TACAACAGAC CRAAAGCXTA TATTGCTGCC 2820 
CAAGGOCKM TGAAATCCAC AGCTGAAGAT TTCTG6AGAA TGATATGGQA ACATAATGTG 2880 
GAAGITATTG TGATGATAAC AAACCTOOTG GAGAAA9GAA GGAOAAAATG TGATCAGTAC 2940 
TGGOCTGCGB ATGGGmGTGA QGAlSTAiCGGG AAqTTTCTGG TCACTCAGAA GAGTGTGCAA 3000 
GTGCTTGCCT ATTATACTGT GAGGAATTTT ACTCTAAGftA ACACAAAAAT AAAAAAGGGC 3060 
TCCCAGAAAG GAAGACCC&G TGGAOCtTGTG GTCACACAGT ATCACTACAC GCAGTGGCCT 3120 
GACA10GGAG TAlOCAGAGTA CTCCCIGCCA GTGCTG2V0CT TTGTGAGAAA GGCAGGCTAT 3180
GOCAAGOGCiC AlGCftGIGGG GCCXaTTGTC GICCACTGCA GTGCIGGAGT TGGAAGAACA 3240 
GGCACAXATA TK3TGCCAQA C2)G!AIGTIO CAGCAGATTC AACAGGAAGG AACTGTCAAC 3300 
ATATTTQGCT TCTTAAAACA CATCCDTTCA CAAAGAAATT ATTTGGTACA AACTOAGGAG 3360 
CAATATCTCT TCATTCATGA TACACTGGTT QAGGCCATAC TTAOTAAAGA AACTGAGGTO 3420 
CTGOACRiSrC ATATTCATGC CTATOTTAAT GCACTGCTCA TTOCTGGACC AGCAQGCAAA 3480 
ACAAAGCEAG AGAAACAAXT CCAGCTCCTa AOCCAGTCAA ATATACAfiCA GAGTGACTAT 3540 
TCTGCaWSOOC TAAAGCAATO CAACAGOGAA AflGAATC3GAA CTTCTTCTAT CATCOCTGTG 3600 
GAAAGATCAA OOGTTGGCSVr TTCATCCCTQ AGTGGAOAAG GCACAGACTA CATCAATGCC 3660 
TCCTATATCA TGGGCTATTA CCAGAGCAAT GAATTCATCA TTACCCRGCA CCCTCXOCTT 3720
CATACCATCA AGGATTTCTG GAGQATQATA TGGGACCATA ATGCCCAACT QGTGaTTATG 3780 
ATTCCTGATG GOCAAAACAT GGCAGAAGAT QAATTTGTTT ACTGGCCAAA TAAAGATGAG 3840 
GCTATAAATT CJTGAGAGCTT TAAGOTCACT CXTATGGCTG AAGRACAOU^ ATGTCTATCT 3900 
AATGAGGAAA AACTTATAAT TCAGGACTTT ATCTTAOAAG CTACACftGGC ATGGAGGAQT 3960 
GAGGGCAGGA ACTTTCTGTG CTCTGACAAC CCTTAXGCAC CAACTAGAAA AAGAAAATTC 4020 
PCT/US02/36810






CGTGC3A.TGTT TACCAGGTAfi CCSVAOATQAT CAATCTGATG AGGCCAGGAG TCTTTGCIGA 4080 

CATTGAGCAG TATCAGTTTC TCTAOUMGT 6ATCCTCAGC CTrOTOAOCA CAAGQCAGGA 4140 

AGA6AATCCA TCCACCTCTC TGGACASEAA TGGTGCAGCA TTGCCTGATG GAAATATAGC 4200 

TOAGAGCXTA CSAGTCXTTAG 4220 

Seq ID NO: C104 DNA Sequence
Nucleic Acid Accession #: XM_002914.6
Coding sequence: 1..4314

1 11 21 31 41 51

| | | | | |

ATGAAlGGATA TCSACATABG AAAASAGTAT ATCATCCCCA GTCCTGGGTA lAiGAAQTOTIS 60 

A(3GGAGAGAA CCRC3C31CTTC TGGGAOGCAC AGAGACC6U:G ABCATTCCAA GTTCAGGAGA 120 

ACTCGACXZGT TGGAATGCCA ACATGCCTTG GAAACAQCAG CXXGAGCCX5A GGGCCTCTCT 180

CTTGATGOCT CCATGCATTC TCAGCTCAGA ATCCTGGATG AGGAGCATOC C3U«3GGAAAG 240 

TAOCATCATQ QCTTGAGTGC TCTGAAGCCC ATCCGQACXA CTTCCAAACA CCAGCACCCA 300 

GTOGACAATG CtGGGCrTTT TTCCW3TATG ACTTTTTCGT QGCTTTCTTC TCTGG0C3CGT 360 

GTGGCCCACA AGAAGGGGGA GCTCrCAAi:G GAAGA03TGT GGTCTCTGTC CAAGCAGQAG 420 

TCTTCTGACG TGAACTGCRQ AA<?ACTAGAG AGACTGTGGC AAGiUiGAGCT OAATGAAGTT 480 

GGGOCAfSAOa CTGCTTCOCT GCOAAGGOTT GTGTOQATCT TCTGCOGCAC CAjGGCTCATC 540

CTSTCCATOG TCTGCCTQAT GATCAGGCftG CXGGCTGGCT TCRfiTGGACC AGC3CTTCATO 600

GTGAAACAX^C TCTTGGAGTA TACCCAGGCA. ACAGACTCTA ACCTGCAGTA CAGCTTGTT6 660 

TTAQTOCTGG GCCTCCTCCT GACGQAAATC GTGOGGTCTT GGTOGCTTGC ACTGACTTQG 720 

GCATTGAATT ACCOAACCGG TGTCCGCTTG OOGGGGGCCA TCCTAACCAT GGCAtrTAAQ 780 

AAGATQCTTA AGTTAAAGAA CAXTAAAGAB AAATCCCTGG GTGAGCTCAT CAACATTTGC 840 

T OCAAO GATS GGCAGSUIAAT GTTTGAGGCA GCAGCOGTllG GCRaCX:TQCT GGCTGGRGGA 900 

COCGTTGTTG CCATCTTAGG CATGATTTAT AATGTAATTA TTCTGGGAOC AACAGGCTTC 960 

CTOOGATCAG CTGTTTTTAT CCTCTTTTAC CCAGCAATGA TGTTTGCATC AOGOCTCACA 1020 

GCAT AT1TCA QOACaAAAATG OGTGGOC6CC AOSSATGAAjC QTOTCCAGAA GATGAATGAA 1080 

GTTCTTACTT ACATTAAATT TATCAAAATG TATGCXZTGGG TCAAAGOiTT TTCTCftGAGT 1140

GTTCAAAAAA. TCCSCOAGGA GGAGOSTGGG AXKTTGGAAA AAGCIGGGTA CTTCXIAGAGC 1200 

ATCACTGTGQ GTGTGGCTCC CATTGTGGTS GTOATTGCCA GCGTGGTGAC CTTCTCTGTT 1260 

CATAKSACCX: TQGQCTTCGA TCTGACAGCA GCACAGGCTT TCaCAGTQGT GACAGTCTTC 1320 

AATTCCATGA CTTTTGCTTT GAAAQTAACA OCGnTTTCAG TAAAGTCCCT CTCAGAAGCC 1380

TCAGTGGCTG TTOACaWSATT TAAGAGTTTG TTTCTAATGG AAOAQGTTCA CATGATAAAQ 1440

AACAAAiOCAG CCftSrCCTCA CATCAASAIA GSUaATQAAAA ATGCCAGCTT GGCAIGGGAC 1500 

TCCTGOCACr CCASTATCCA GAACTOOOCC AAGCT6AC0C CCAAAATQAA AAAAGACMUS 1560 

AGGGCTTCCA GGGGCAAGAA AGAGAASGTO AOGCAGCTrGC AGOGCACTGA GCATCAGGCXS 1620 

OTGCTGGCAG AGCAOAAAGG CXS^CCTCCTC CIGGACAG-TG Aa3AaCC3t3CC CAGrCCCGAA 1680

GAGGnAQAAG GCAAGCACAT CC!ACC=rOGGC CAOCTGCIGCT TACASAGGAC AiCZGCACASC 1740 

ATCGATCTGG AOAITCCftASA GGGTAAACTG GirGGAATCT GG(3GC!Ai3TGT GGGAAGIGGA 1800

AAAACCTCrC TCATTTC&fiC CATTTTAGQC CAGATQACGC TTCIAGAGGG CAi8C3^TTGCA 1860 

ATCAGTOGAA CXOTOSCTTA TGTGGOCCAS CftGQOClOGA TCCTGAArrOC TACTCIGAGA 1920 

GACAACATCX: T6TTTGGGAA GGAATAIOAT GAAGAAAOAJT ACAACTCTGI GCTGAACAGC 1980 

TGCTGCCTGA GGCCTGAOCT GGCCATTCTT OCCAGO^CQ ACCT6AGG6A GA3:TGGAGA0 2040 

CGAGOAQCCA ACdtSAGCOa TOaaCAGCaC <?bGAGGATCA GCCTTGCCa} GGCCTTGTAT 2100 

AGTGACAdSGA 6CATCTACAT CCTGGACSM: GOCCTCAGIG QCITAGATQC CCATGTGGGC 2160 

AACGACATCrr TCAATAfiTGC TATCaCGG&AA CATCTCAAGT CCAAGACRGT TCTGTTTCTT 2220 

ACCCAOCAGT TACA0TACCT GGTTGACTGT GATGAAGTGA TCTTCATGAA AGAGQGCTGT 2280

ATTACX3GAAA GAGGCAGGCA TGAGGAACTG ATOAASTTAA ATGGTGACTA TGCTAOCaTT 2340 

TTTAATAACC TGTTGCTGGG AfSAGACAGCG OCAGTTGI^ TCAATTCAAA AAAGGAAAiCC 2400 

AGTSSTTCAC AGAAGAAGTC ACAAGACAAG GGXCCTAAAA CAGGATCnST AAAOAAGQAA 2460 

AAftSCAOTAA AGOCAGfiflOA AQGGCAGCTT GTOCAGCrGG AAGAAJUiaaQ QCAGGGTTCA. 2520 

GTGCCCTGGT CAOTATATGG TGTCTACATC CAGGCTGCTQ GGGQCCCXTrT GGCATTGCTG 2580

GTTATTATQG OOCTTTTCAT GCTOAATGTA GQCAGCACOG CCTICAGC:aC CTGOTGGTTC 2640 

AGTTACTG6A TCAAGGAAOG AAGOGGGAAC AOCACIGXGA CTOQAflOGAA GGAGACCTGG 2700 

GTGAGTOACA GCAT6AAGGA CMTCCICAT AKGCAGXACT AIXSOCASCAT CTAOGCOCTC 2760 

TCCATGGCAG TCATGCTQAT CCTGAAAGCC ATT06AGGAG TTQTCTTTGT CAAGGGCACQ 2820

CTGCGAGCTT CCTCCCGGCT GC3^*n3AOQA)S CTTTTCCaAA GGATOCTTCG AAGCCCTATG 2880 

AAOTTTTTTG ACACGACCCC CACAQGQRGG ATTCTCAACA GGITTTCCAA ASACATGGAT 2940

GAAGTTGAOG IXSGGQCTGCX: GTT0CA66CC GAGATGTTCA TCCAGAAGGT TATCCTQQTB 3000

TTCTTCTOTG TGGGAATGAT CGOlGGRGTC TTCCOSTGGT TCCTTGrcOGC AOTOGGGCOC 3060 

CTTGTCATCC TCTTTTCAGT CX3PGCACATT GTCTCCAGGG TC5CTGATTCG OGAGCTGAA6 3120 

OGTCIGGACA ATATCACGCA 6TCAC3CTTXC CTCTCOCACA TCAOGTtTAG CATACAOGGC 3180 

CTTGCCA<XA TCCACGCCTA CAATAAAiGGG C^^GGAGITTC tCGCACAGATA OCAGGAGCIG 3240 

CTGGATGACA AC3CAAGCPCC TTTTTTTTTG TTTACGTGTO CGATGCGGTG GCXGaCTGTC 3300 

CGOCTaOACC TCATCAfiCAT CXaXXTCCRTC AOCRCCftOGe QGCTGRTQAT CCTTCTTftTG 3360 

CACGGGCAGA TTCCCCCASC CTATG0C5GGT CTCOCCATCT CTTATGCTGT CCAGTTAACG 3420 

OQGCTaTTCX: AGrrTTACGGX CAGACTGGCA TCTGAGACAG AAGCTCGATT CACCTCGGTG 3480 

GAGAGGAICA ATCACTACAT TAAGACTCTG ITCCTTGGAAG CAOCTGCCAG AATTAAGAAC 3540 

AAjQGCTGCCT CCCCTGACTQ GCCCCAOOAG GGAGAGGTGA CCTTTGAGAA CGCAGMsATQ 3600 

AGGTACOGAG AAAACCTGCC TCTOGTOCTA AMSAAAGTAT CCTTCACX3AT CAAACCTAAA 3660 

6AGAAGATTO GCATTGTQOO GGGGACAGaA TCAGGGAAGT CCTa3CTQt3G GATQGCCClX: 3720 

TTC0G7CTG6 TGGAGTTATC TGGftGGCTGC ATCSVAOATTG ATOGAGTGAG AATCAGTGAT 3780

ATTGSCdra CXrOACCTCaS AAGCAAACTC TCTATCATTC CTCAAGAGCC GGTGCIGTTC 3840 

AGTGGCACTG TCAGATOW^A TTTGGACOCC TTGAiVCCAGT ACACTGAAGA CGAGATTTG6 3900 

GATGCCCTGG A(^GGACACA CATGAAAGAA TGTATTGCTC AGCTACCTCT GAAACrXGAA 3960 

TCTGAAGTGA TGGAGAATGG GGATAACTTC TCRQTGGGGG AAGGGCAGCT CTTQTQCftTA 4020 

GCTAOAGCCC TGCTCCGCCA CTGTAAGATT CTGATTTTAG ATGAAfiCCAC AGCTGCX3WK3 4080 

GACACAGAGA CAGACTTATT GATTCAAGAG ACCATCOGAG AAGCATTTGC AGAClGTACC 4140

AT6CTSACCA TTGCCCATCX5 CCTGGACACG tyPTCTAjOGCT CCGATAGGAT TATGGIGCTG 4200 

GCCCAGGGAC AGGTOOTGGA GnTOACACC CCATCGGTCC TTCiaTCCAA OGAOtfSlTCC 4260 

CGArrCTATG CCATTGTTTaC TGCTTGChfiAG AACAAGGTOG CTGTCAAGGQ CTAG 4314 

Seq ID NO: C105 DNA Sequence

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439

1 11 21 31 41 51

| | | | | |

CGOGGCAGGT 0GCTCATGCT CGGORGCGTG OTTGAfiCGGC TGGCGCGGTT OTCCTGGAGC 60 

AQGGGCGCAa QAATTCTGAT OTGAAACTAA CftGTCTGTGA GCCCTGOAAC CTCCGCTC3VG 120 

AGAAGAtCGAA GGATATOGAC ATAGOTIAAAG AGTATATCAT CCCCAGTCCT GGGTATAQAA 180

GTGTGAGQC3A OAOAACCAGC ACTTCTG£3GA C(5CACAGAGA CCGTGAAOAT TCCftAGTTCA 240 

OGAGAACrCG AOCOTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCSA GCC<?ABGGCC 30O 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGRATCCT GGATGAOQAE CATCCK3^AOG 360

GAAA&TACCA TCATGGCTTG AGTGCTCTGA AGCCCATCX33 GACTACTTOC AAACRCCAGC 420 

AC!C!CAGT(3(3A CAATGCTGQG CTTTTTTCCT GTATGACm: TTCBTGQCTT TCTTCTCIGG 480 

CCC3GTGTGGC CCACAAGAAO GOOGBGCTCT CAATGGAAQA CGTGTGGTCT CTGTCCAAGC 540 

ACX3AGTCITC TGAGGTGAAC TGCAGAAGAC TAOAGAGACT GTGGCAAQAA GAiGCTGAAUG 600 

AAGTTYSGGCC AGACGCTGCT TCCCTGOGAA GGGTTOTGTG GATCTTCTGC CGCACCAOQC 660 

TCATC3CTOTC CATCGTGTGC CTGATGATCR OGCAGCTGGC TGGCTTCAGT GGACCAGCCT 720 

TCATGGTGAA ACACXTCTTG GAGTATACCC AGGCAAC3USA GTCTAAOCTG CABTAEaOCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGAC9GG AAATCX3TG0G CTCTTGGTCa CTTGCACnGA 840 

CTTGQGCATZ GAATTACCGA AGCOGTGTCC GCTTQCQGOG GGCCATGCTA ACCATGGCAT 900 

TTAAGAACSAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CJCTQQOTSAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAO AOAATGITTG AGGCRGCAGC OGTTGGCAGC CTGCTOOCTG 1020 

GAGQACCCQT TGTTQOCATC TTAGGCATGA TTTATAATGT AATTATTCTG G6A0CAACAG 1080

GCTTCCTGGG ATCAaCTQTT TTTATOCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 

TCACAOCA^A TTTCft06A6A AAATGClGTGG COSCCACGGA TGAAC6XGTC CM3AAGATQA 1200

RTGAAGTTCT TACTTRCWrT AAATTTATCA AAAT6XATGC CTGOCSTCAAA GCATTTTCTC 1260 

AGAQTGTTCA AAAAATCCQC GfijtSGAGGJiGC GTOGQATATT GGAAAAAGCC GGGTACTTOC 1320 

AGGGTATCAC TGTQGGTGTG GCTCOCATTG TGGTGGTGAT TaCOVSCGTG GTGACCTTCT 1380

CTGrrCRTAT GRJtXXrrGQQC TTCGATCTGR CAGCAGCACA GGCTTTCACa GTOOTGACAG 1440 

TCTTCAATTC CATOWirrTTT GCTTTOAAAG TAACACCGTT TTCAGTAAAG TCCXTTCTCRG 1500 

AAGCSTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTJCT AATQGAAGAG QTTCACATGA 1560

TAAAOAACAA ACCAGOCS^T CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTOGCAT 1620

GGG&CTCCTC CCACTCCAOT ATCCAGAACT CQCCCAAGCX GACOCCCAAA ATQAAAAAAG 1680 

ACAAGAGGGC TTCCAGOGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGOGC ACTGAGCATC 1740 

ASGOGGTGCT GGCAiSAGCAfi AAAGGCCACC TCCTCCTGGA CAGTCACX3Aa CBGCCCfiGTC 1800 

CCOAAGAGGA AGAAOGCAAG CACIVTCCACC TGOGCCACCT GCGCTXACAG AGGACACTGC 1860

ACAGCATOGA TCTGGAGATC CAAGAGGGTA AAClGGTTGG AATCTGOSaC j^GTGTGGGAA 1920 

OTGQAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAOAT QACGCTTCTA GAGGGCAGCA 1980 

TTGCAATCRG TGOAACCTTC GCTTATGTISG CGGAGC2VGGC CTGC5ATCCTC AATGCTACXC 2040 

TOaGAfiACAA CATCCTGTTT GGC3AAQGAAT ATGATGAAGA AAGAlJACAAC TCTGTGCTGA 2100

ACAGCTGCTG CCTOAGGCCT GACCTGGCCA TTCTTC0CA6 CAOCGACC^ A12QGAGA3TO 2160 

OAGAGOGAQG AGCC7UU:CTG AGCGGTGGGC AGOGCCAGAG GATCAGCCTT GCCGGGGCCT 2220

TGTATAGTGA CAQQAGCATC TACATCCTGQ ACBAC5CCCCT CRGTGCXrrTA QATGCOCRTG 2280 

TGGQCAACCA CATCTTCAAT AGTGCTATCC GGAAACRTCT CAAQTCC3UW5 ACAGTTCTGT 2340 

TTGTTACCCA OCAGTTACAG TACCTGGTTG ACTOIGATGA AGTGATCTTC ATGAAAOAGG 2400

GCTGTATTAC GGAAA6M3SC ACCX^ATGAQG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTOGGAGASA CACX33CCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACOUSTGG TTCACAGAAG AAaTCAC3>kAG ACAAGGGTCC TAAAAiQWGGA TCAGTAAAGA 2580 

AGGAAAAABC AQTAAAGCCA GAGGAAGGGC AGCITGTGCA GCXGGAAGAS AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATQCSTGlCT ACATCCAOGC TGCrGGGGGC CCCTTOaCAT 2700 

TCCTQGTTAT TAT06CCCTT TTCATGCTGA ATGTAGGCAG CACC3QCCTTC AGCACCIGGT 2760 

G6TTGA5TTA CTGGATCAAa CAAGGAAGGG GGAACACCAC IXSXGACTOGA GGSAACGAGA 2820 

CCTOGGXGAG TGACASCATG AAOGACAATC CTCATATGCA GTACTATGCC MSCKSCThCB 2880

CCCTCTCCAT GQCAOTCAOXS CTGATCCTGA AA6CCATTGG AGGAGTTGTC TTTGTCAAGG 2940 

GCAC3GCIGCG AGCTTCCTCC CGQCTGCATG A06AGCTTTT CCGAAQSATC CTTOGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCOOC31CAO GGAGOATTCT CAACRGGTTT TCCAAAGACA 3060 

^GGATGAAGT TGAGGT60GG CTGCGGXTCC AGGCOGABAT STTCKTCCHG AAOGTTATGC 3120 

Tt^GTOTTCTT CTGTGTGOGA ATGATOGCAG aAGXCITCCC GTGOTTCCIT <7IGG<3uaTG6 3180 

GGOGCCTTGT CATCCTGTTT TCAGTCCTGC ACAXIGTCIC CAGG6TCCTG ATTCQGOAOC 3240 

TOAAGGSTCr GGAiCAATATC ACGCAOTCAC CTTTCCTCTC CCACATCAOS TCGM3GATAC 3300 

AjGGGOCTTGC CACGATCCAC GCCTACAATA AAGQQCAOGA GTTTCTGCAC AQATACCAGG 3360 

AGCIGCTGSA TGACAACCAA GCTCCTTTTT ITTKTTTTAC OTGIGGGAXG GGGTGGCEGa 3420 

CTOTCEOGOCI GOACCTCATC ASCATCQGC3C TCATCACXAC CAOQGQGCia ATGATCGITC 3480 

TIATGCACGG GCAGATTCOC COWSCCTATG OSGGTCTCQC CHTCTCTTAT GCTGTOCWBT 3540 

TAAOGQQGCr QTTCCAGTTT AOGGTCAfiAC 1GGCATGTGA GACAGAAGCT CGATTCAOCT 3600 

CQGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCrT GGAAGCRCCT OOCAOAAOTA 3660 

AfaAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA 6GTGACCTTT GAGAAOSCAG 3720

AGATOaCGTA COGAGAAAAC CICOCTCTTG TOCTAAAGAA AGTATCCTTC AOQATCMAC 3780

CTAAAGAOAA GATTGGCATT GTGGGGOGGA CAGGATCAGG GAAGTOCTCG CTGGOGiATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGSAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

GTGATATTOG OCTTGCCGAC CTC5CQRAI3CA. AACTCTCIAT CATTOCTCAA GAGCCGGTGC 3960 

T6TTCAGTGG CACIGTCAOA TCAAATTTGG ACOCCTTCAA CTCAStACACT GAAGACCABA 4020 

TTTGGGATSC CCTGOAGAGG ACACAiCATGA AAIC3AAIGTAT TGCTCAQCTA CX!TCIGAAAC 4080

TJ6AATCTGA AflTGATOOitf* AATGGGGATA ACTTCTCROr GG6GGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AOCCCTGCTG CGCCACTGTA AGATTCTGAT TTTAGATBAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAG7U3ACX:AT CCGAGAAGCA TTTGCAGACT 4260 

GTACCAICcr GAGGATTGCC CATOGCCTGC ACAiOGGTTCT AGGCTOCGAT AGGATIATGG 4320 

TSCTGGCCCA GGGACAi^TG GTGGa^GTTTG ACACCOCnXC OGTCCTTCTG TCCAACGACA 4380

GTTCCCGAtT CXATGOCAT6 TTTGCTGCI6 CAGAGAACAA GOTCXiCnGTC AAOGGCTGAC 4440 

TCCTCCCTGT TGAOGAAGTC TCTTTTCTTT AfiAGCATTGC CATTOCCTGC CTG6GGC0GG 4500 

CCCCTCATCG QGTOCTCCTA CCOAAACCTT GOCTTTCTCG ATTTTATCTT TOGCACAGCA 4560 

GTTCOGGATT GGCTTSTGIG TTTCACXTTT AGOGAGAGTC ATATTTTGAT TATX6TATXT 4620 

ATTCCATATT CAT6TAAACA AAATTTAaTT Tl"r( J i ''lX.- r i' A ATTGCACTCT A&AAGGTTCA 4680 
OOGAACCGTT
TCTATATATA 
TATTAAAATA 
TTGCTGTACT 
CTCTAGCTGG 
ATAOTOBGGC 
GAGACGG6T6 
CTGTOCTGGT 
TTTCACTCCC 
TTTCCTGCCX 
TCCCACTC30C 



ATTCCCACAC 
CTCACOGCAG 
CAGCTCTTGC 
ACCTCAGGTT 

GGGacraoTA 

ATGTOGTGAC 
CAAAAATCTO 
AAAAAAAAAA 



ATTATAATTG 
ATTCTGTACA 
AOCACT6TGC 
AGAGATCTGG 
TQGTTTCACG 
CTCCGACAGC 
fSGOGGCTGGA 
GTCACTTACr 
TCCATCAAGA 
TCTTCTTTTT 
TCAGGTTCCT 
AGCCXTGGAG 
CTCCACAGTT 
TOGTCGCACA 
TAATCAGT6T 
GCIOGTTGCT 
GCTCAGGTGG 
CAACTAGACA 
AAAATQTGAA 
AAAAAAAA 



TATCAGAGGC 
TAECCTATAT 
TAATAACAGT 
TTTTGCTATT 
GTQCCAGGTT 

qaccatgcag 
gtttctgtca 

ATQGGQATCA 
GCTGTTGTTT 
ATCSaCTOGCC 
GCAACTGCTG 
CAOTGGCZAGG 
GTCTCTCTCT 
CTCACACTGG 
GTGTGGTTTG 
GCQTGGTCAC 
TTCTGTCGCC 
TAAAA7TATT 



CTATAATGAA 
TTACAGTGAA 
GCAIATTCCX 
AGACTGTAGG 
TTCTGGGTGT 
GCCTCCOCAC 
AG06CCGTGA 
GGAGAGCAGC 
CA3AGACATT 
CTAAACAAGA 



CTTTTTGAGG 
GCrrC&GGATT 
CTCTCTCCCC 
CGTAGAAGTT 

tgctgtcatc 
ttagcatgtt 
ttqgattttg 



GCITTATAGS 
AAT^AAGCT 
TTCtATCATT 
AAQAGTAGCA 
CCAAAGGAA6 
AGCOGCXCCA 
GrrCTCAGGG 
GGGGOQAAGC 
CCTCCGAGCC 
ATCftGTCTAT 
GCTCTCCAGC 
TG6CACTTTT 
TOC5T13GC3TCT 
TCAAAGTCTG 
TTTGTACTGT 
GCAAACCCGC 
AGTTQAATGG 
TGCTGAACSU: 
*rAAAAAAAAA 



TGTAGCTATA 
GTTTATTTTA 

TTTGT/WCflST 
TTTCATTCTT 
ACGrrGTGGCA 

CTC5CTGCCTT 
OCAGGCCOCT 
GGGGAGTTTG 
CCACAGAGAG 



TCATTTGCCT 

GTrrrccTTT 

CAACTTTAAG 
AAAOAQAOCT 
TTTGTGCTGT 
TCAGCGTTGC 
CTTGTGGAAG 
AAAAAAAAAA 



4740 
4800 

4920 
49B0 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5G20 
5838 



Seq ID NO: C106 DNA Sequence
Nucleic Acid Accession #: NM_005562
Coding sequence: 90..3671

1 11 21 31 41 51

| | | | | |

AiCAGCaOAGC (3CAGA6TGAG AACCACCAAC OGAGGCGCC6 GGGAGOGACC CCTGCAGCGQ 60

AGACAGAGAC TGAGCGGCCC GGCAOCGCCA TGCCTGOGCT CTGGCTGGGC TGCTGCCTCT 120 

GCITCICaCT CXZTCCTGCCC GCAQCCCGGG CCACCICC3^G 6AGGGAAGTC TGTGATTGCA 180 

ATGGGAAGTC CA6GCAGTOT ATCTTTGATC GGGAACITCA CAGACftAACT GGTAAXGGAT 240 

TCCGCTGCCT CAACTGCAAT 6ACAACAC1G ATGGCATTCA CTGCGAGAAG TGCAAGAAT6 300 

GCTTTTACOG GCACAGAGAA AGGGAGCGCT GTTTGOCCTC CAATTGTAAC TCX3\AAGGTT 360 

CTCTTAfiTOC TC33ATQTaAC AACTCTGGAC GGTGCAGCTG TAAACCAGGT GTGACAGGAG 420 

CCAGATGGGA CCX3ATGTCT6 CCAGGCTTCC ACATQCTTCAC GQATGCXSGGa TGCACCCAAG 480

ACQ^GABACT GCTAGAiCTGC AAGlGTOACT GISACGCAGC TGGCATG8CA GGGCOCTGT6 540 

ACGCOGGOOS CTOTGT C TGC AAGCCAGCTG TTACTOOAGA ACGCTQTGAT AGOTOTCaAT 600 

CAGGTTACTA TAATCTQGAT G6GGGGAACC CTGAGQOCTG TACCCAGTGT TTCTGCTAlG 660 

GGCATTCAGC CAGCIGCOGC AGCTCTGCAG AATACAGTGT CCZATAAGATC ACX^CTACCT 720 

XTCAXCAAGA !i:GTTGATGaC TGGAAOGCXG O'OCAAOGAAA TGGGTCXCCT GCAAAGCTCC 780 

AATOGTCaUIA GOGCCATCAA GATGTGTTTA GCTCAGCCXai ACGACTAGAC CCTGTCTATT 840 

TTGIGGCTCC TOCCAAATTT CTTGGGAATC AACAGCTGA6 CXATGG6CAA AGGCIGTCCT 900

TTGACTACCXS TGTGGACAGA GGAGG<3iGAC ACCCATCTQC CXZATBATGTG ATTCTGGAAG 960 

GTGCIGGTCT AOGGATCACA GCTCCCMGA TGCXIACTTGG CAAGACACTG CCT1GTG6GC 1020 

TCA0CAA6AC TTACACAMC AG6TTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TOAGTXnCTT TQAGCKXCaA AGGTZACTGC GGAATCICAC AGOCCTCGGC ATOCGAGCEA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCOCGOCCXG 1200

TCXCTGGAGC CX!(3U3(2ACCC TGGGTIGAAC AGTGTATATG TCCXGTIGGG TACAAGGGGC 1260 

AATTCTtSCCA GGATTGTGCT TCTGGCTACA AOAGAGATTC AOCJaAGACTG GGGLXTi'T'iXJ 1320 

GCACX:TGTAT trCCTTGTAAC TGTCAAGOGG GAGGGGOCTG TGATCCAGAC ACAGGAGATT 1380 

GTTAITCAGG GOATGAGAAT CCIGACATZG AaTOTaCTOA CTGOOCAATr OQTTTCTACA 1440 

AGGATC06GA OSAOCCOCGC AGCTGCAAGC CATOTOOCTG TCATAAOGGS TTCA6CIGCT 1500 

CAGTGATGCC GQAGACGGAG GAGOTGGTQT GCAATAACTG CCCTGCOSGQ GTCACOGGTG 1560 

CCOGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGOCCAG 1620 

TGAGGCXrrirG TCAGCCXTIGT CAATGCAACA ACAATGXOGA CCOCSUSTGCX: TCTGGGAATT 1680

GTBACCSGGCr QACAGQCAGG TGTTTQAAGfT GTATCCACAA CACABCGGGC ATCTACTGCQ 1740

ACCAGieCAA AGCAGGCXAC TTOSaGGACC CATtGGClCC CAAOCCAGCA GACAAGTGTC 1800 

GAGCTTGCAA CTGTAAOOCC ATGGGCTCRQ AGOCTGTAGQ ATGTCEAAST GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT QGrGGCOXaV ACTGTGAGCA -rGGAGCATTC AGCXGTOCAG 1920 

CXTGCTATAA TCAAGTQAAQ ATTCAGATGG ATCAGTZTAT GCZAQCAlGCTT CAQAGAATGG 1980

AGGCCCXGAT TTCAAAGGCT CM3GGTOCT8 ATWV&TAGT ACCTGATACA GAGCTGGAAG 2040 

GCAGGATGCA GGAOaCTCaAa CAGOCCCTTC AGGACRXTCT GAGAGAIXSCX: CAGAtTTCAG 2100 

AAGGTGC7AG CRGATCCCTT GGTCTOCfiGT TGGCCAAGQT GAGOAGCCAA GAGAACAGCT 2160 

ACCAGAGCGG CClGGATGftC CTCftflGATGA CTGTGGAAAfl AGTTOSGGCT CIGGGAAGTC 2220 

A^aXAOCAOAA CCX3AGTTOGG GATACTCACS^ GGCTCATCAC TCAGATGCAlG CZGAOCCroa 2280 

CAGAAAGTG& ASCTTCCTTG GGAAACACIA ACATTOCT6C CTCAGACCAC TACGTGGGGC 2340 

CAAASGGCET TAAAAQTCTO GCTCAGOAGQ CCACSLAQATT AOCAGAAAGC CACSOTTOAGT 2400

CAGCCAGTAA CATGGAGCAA CTGAiCAAGGG AAACIGAG6A CTATTCCAAA CAAGCOCrCT 2460 

CACTOGTGQG CAAGGCCCTG CATCSAAQGAG TCGGAAGGOG AAOCGGTAGC CCGOACGGTG 2520
CTGTGGTGGA AGGGCXIGTG OAAAAATTGG AGAAAACCa^A GtCCCOfGGOC CAGCAGTTGA 2580 
CAAGGOacraC CACTGAAGOG GAAATTGAAG CAGATAGOTC TTATCAQCAC AGTCTCOGCC 2640 
TCXrrGGAtXC AS T GTCTCGS CTTCAB6GAG WAGIGATCA GTCCTTTCSUS GTGGAAGAAG 2700 
CAAAGAGGAT CAAACAAAAA GOSGATTCAC GGTAACCAGG CATATGQATG 2760 
AGXrCAAGOG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCZTAC 2820
AGAATGGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCXXX3T6CC AATCTTGCTA 2880
AAAGCAGAGC ACAAGTkAGCA CTGAGTAIGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 
TCCTTAAAAA OCTCAGAGAG TTTGACCTGC AGOTOOACAA CAGAAAAGCA OAAGCTQAAG 3000 
AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGOC3WJT GACAAGACCC 3060 
AaCAAGC3U3A A7U3AGCCCTG GGGAGOGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGO 3120 
CCQQGGAGGC CCTGGAAATC TCCA6TGAGA TTGAACAfiGA QATTOGQAGT CTGAACTTGG 3180 
AAGCCAATQT GACAGCAGAT GGAGOCTTGQ CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 
OTGAGATGAG GGAAQTGOAA GGAGAGCTGG AAAQGAAGGA □CTaaAOTTT GACAOGAATA 3300
TOGA3X3CAGT ACAGATGGTG ATCACAOAAG OCCAGAAGOT TGATACCAGA GCCAAGAACG 3360 
CTGGGGTXAC AATCCAAGAC ACACTCAACA CATTAGACGG GCTOCTGCAT CTGATGQACC 3420 
RGCX^CTCAG TGTAQATGAA GAGOGGCTGG TCTTACTGGA GCAGAAGCTT TCCXXSftGCCA 3480

AOACCCRGAT CAACAGCCAA CTGOSGCCCA TGATGTCJM3A GCTOOARISAG AOGGCACGTC 3540 

AGCAeAGOGG CCACCTCCttT TIGCXGGMA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600

5AGAACTT06A GAACATTAGG GACAACCTGC CCCCZAGOCTQ CTACRATACC CA£3aCTCTTG 3660

AGCAACABTG AAGCTGCCAT AAATATTTCT C3UVCTGASGT TCTTGQGATA CAGATCTCAG 3720

CGCTCGGGAG CCATQTCATG TOAGTGQGTG GGATCGOOAC ATTTOAACAT QTTTAATGGG 3780 

TATGCTCR<3G TCAACTGACC TGACOCCATT CXTTGATCCCA TGGCCAQGTG GTTCTCTTAT 3840 

TGCAGCATAC TCCTTGCTTC CTGATQCTGQ GC3VATGAGGC AGATAiGCACT GGGT6TGAGA 3900

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAOGCAQ 3960

ATGTriGCCT CATAATAGTC GTAACTTGOAiG TCCTGGAATT TOGACAAGTG CTGTTOGGAT 4020

ATAQTCAACT TATTCTTTGA GfAATG-TGAC TAAAGGAAAA AACTTTGACT rrrOGCCAGOC 4080 

ATGAAATTCT TCCTAATGTC AGAACA<3«3T GCAACCCAGT CACACTGTGG CCAGTAAAAT 4Z40 

ACTATTGCCT CATATTG^CC TClGCAAGCT TCTMCTGAT CajGAjGTTCCT CCTACTTTACA 4200 

ACCCAGGOTQ TGAACATGTT CTOCATTTTC AAQCTGGAAjG AAGTGAGCAG TGTTGGAGTG 4260

AGCSACXTOTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCXZTO CCACCTTCAA 4320

GTTCTGGACC TOGQCATGAC ATCCTTTCTT TTAATGATGC CATGGC7VACT TAGAGAITGC 4380

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTlG GGAAAGTATT TACTTTTTCX3 4440 

GTTTCAAAfSr GATAGAAAAQ TOTGGCTTGG GCATTGAAAG AGGTAA7UVTT CTCTAGATIT 4500 

ATIAOTCXTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTXATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGfAA rGTTCCTACT 4620

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCa<kTC TTTOCATOCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTQAQTAC CTACTGTGTG GGAGGGGCTG 4740 

GTGGQACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAQGAAGACA 4800

AGCATTTTTA AAAAATAAAT TXAAACTTAC AAACTTTQTT TQTCACAAGT GGTGTTTATT 4860 

GCAATAACCB CTTGGTTTGC AACCTCTTTG CTCAACAGAA C&TATGTIGC AA^CCCTCC 4920

CATOGGGGCA CTTGAGTrCT GGCAASGCXG ACAGAGCTCT GGaTTQTQCA CATTTCTTTC 4980 

CATTCCAGCT QTCACTCTOr GCCTTTCTAC AACTSATTQC AACAGACTGT TGAGTTA1GA 5040 

EAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCIIGGCT GGQAAOACIA 5100 

TGGTGGTGCC TTeClTCTGT ATTTCCTTGG ATTTTCCTOA AAOTOTTTTT AAATAAAGAA 5160 

CAATTQTTAO ATGCC 5175






Seq ID NO: C107 DNA Sequence
Nucleic Acid Accession #: NM_021101
Coding sequence: 221..856

1 11 21 31 41 51

| | | | | |

GAATTCQOCA DBAGGOCTOG TGCOGGOGAG CAAOCTCAGC TTCTAGTATC CAOACTCCAG 60 

0SCGGGCCX3S GGCGCGGftCC OCSULCOCCGA CCCSUSftflCIT CriTCCAGC3GGC GGOGCAGOGA 120 

SCAGQGCTOC CCGCCTTAAC TTCCTCCGO© GGGCCC3V3CC ACCTTCOGGA GTCCGGGTTa 180

CCCACCTGCA AACTCTCCGC XTSrcClGCfJllC TGCCACXXOT aAGCC3U3C5GC GGGCGCOCGA 240 

GOGAfiTCATQ QCCAAOC3CX3G GGCTGCAGCT GTTQGGCTTC ATTCTCGCCI TCCTGGGATO 300 

□ATCQGCGOC ATCSGTCACCSi CTGCCCIGCC CCAGTGGAOG ATTTACTCCT ATGCCQGOGA 360 

CAACATOG!IG ACCGCCCAG6 CCATGTAOQA GGGGCTGTGG ATGTCCTGCG TGTCGCAQAQ 420

CAOOGGQCAG ATCCAGXQCA AAGTCTTTGA CTOCl"i W iX3 AATCTGAGCA GCACATTGCA 480

AGCAACCCGT GCCTTGATGG TQGTTQGCAT CCTCCTGGGA GTGATAGCAA TCTTTGTOGC 540 

CAOOGTTGGC ATGAAGTGTA TOARJSTGCTT aOAAOAOGAT OAGGTGCAGA AGATGAGGAT 600 

GSCTGTCATT GGGGGC3GCSA TATTTCTTCr TGCAGGTCTG GCTATTTTAG TTOCCACaGC 660

ATGGTATGGC AATAGAATCG TTCAAGAATT CIATGAlCCCT ATQACXZCCaG TCAATGCCAG 720 

DU GTACJGAA^T GOTCAGGCTC TCTTCACTGG CTGQGCTGCr GCTTCTCTCT 6CCTTCT<5GQ 780 

AGOTGOCCTA CTTTGCTGrr CCTOTCCCG6 AAAAACMCXI TCTTACCCAA CACCAAGGCC 840 

CTATCCAAAA CCTQCACCXT CCAGCGEQRA AOACTACGTG TGACACAGAG GCAAAAGGAG 900

AAAATCATGT TGAAACRAAC CGAAAATGGA CATTGAGATA CTATCATTAA CATI3W3GACC 960

TTAGAATTTl' GGGTATTGTA ATCFAAAGTA TGTTATCACA AAACSUVACAA ACAAACAAAA 1020 

AACQCATOTQ TTARAATACT CAGTGCTAAA CATGGCTTAA TCrCATTTTA TCTTCTTTOC 1080

TCAATATAGG AGGGAAGATT TTTOCRTTTG TATTACTGCT TOGCATTQAG TAATCATACT 1140 

CAAATGOGOa AAGGGGTGCT CCTTAAATAT ATATAlGATAT OlAXATAtAC ATQTTTTTCT 1200 

A7TAAAAATA GGC2U3TAAAA AAAAAAAAAA AAAAAAA 1237 

Seq ID NO: C108 DNA Sequence

Nucleic Acid Accession #: AF508964.1
Coding sequence: 98..1531

1 11 21 31 41 51

CAGAGCX^GCA AGCBCACOGA AGQCCTCCCC BCACX3GTGGG GGAAAGOGGC OGGTCCAGCG 60 

CQGTOACAGG CACTCG^CT GGCACTGGCT GCTAGGGATG TCGTCCTGGA TAAGGIGGCA 120 

TGGACOCGOC ATG606GGGC VCXGGGGCTT CTGCTGCCTG GTTGTGGGCT TCTSOAGGGC 180 

OGGTTTCQCC TGTOCXZACBT CCTGCAAATG CAGIGCSClCT CSGSKTCTGOT GCAGOGACCC 240 

TTCTCCTGGC ATCGnsGCAT XTIX^GAGATT GGASOCIAAC AGTGnUSAXC CTGAGAACAT 300 

CACJGGAAATT TTCTTCGCAA ACCAGAAAAG GTTAiSAAATC ATCAAOSAAG ATGATGTTOA 360 

AGCTTATGTG GGACTGAGAA AXCIQAGAAT TGTGGATTCT GGATTAAAAT TTQTGGCTCA 420 

TA AM3CKIT T CTOAAAAACa GCAAOCIGCA GCACATCAAT TTTACCXSOAA ACAAACTGAC 480 

GAGTTTGTCT AOGAAACATT TOCGlCACCT TGACrTGTCT GAACIGATCX: TGOrGGOCAA 540 

TCCATTTAX^ TGCTGCrGltS ACATTATQTG GATCAAOACT eiX3CAAGAGG CXAAATOCAS 600 

TCCAGACACT OySGATTTQT ACTGCCTGAA TGAAAGCAGC AAGAATATTC CCCTG(?(ZAAA 660 

CCTGCAGATA CCCAATTGTG 6TTTGCCATC TGCAAATCTQ GCOSCACCTA AOCTCACTGT 720 

GGAGGAAGOA AAGTCTATCA CATTATCCTG TAGTGTGGCA GGTGATGCG6 TTCCIAATAT 780

6TATTGGGAT GTTQQTAACC TGGTTXCCAA ACKZATGAAT GAAACAAGCC AGACACAGGG 840 

CTGCTTAAGG ATAACTAACA TTTCATCCQA TGACAGTGGG AAGCRlOATCT CTTaTGTGGC 900

6GAAAATCTT GTAGGAGAAG ATCAAGATTC TGTCAAOCTC ACTGTOCATT TTCCACCAAC 960 

TATCACATTT CTOGAATCIC CTVACCTCAOA CCACCACTGG TOCaTTCCAT TCACTQTGAA 1020 

AGGCAACCC3C AAACCAGCGC TfTCAGTGGTl7 CTATAAOGGG GCAATATTGA ATQAGTCCAA 1080

ATACaSTCTGT ACTAAAAIAC ATGTTACCAA TCKCAOaQMS TAOCAOGGCT GCCTCCAGCT 1140 
GGATAATCCC ACTCACATGA ACAATGGGGA CTACACrCTA ATAGCCAAGA ATGAGTATGG 1200

GAAGGATGAG AAACAGATTT CTGCTCACTT CATOGaCTOG CCrTOQAATTG ACGATGGTGC 1260

AAACCCAAAT TATCCTGATG TAATTTATGA AGATTATGGA ACTGCAGCGA ATGACATCXSG 1320 

G6ACACCAC3Q AACZAQAAGTA ATGAAATCX:^: TTCCACAQAC QTCACTGATA AAACC6GTCG 1380

G6AACATCTC TOGGTCTATG CTGTGGTGGT GATTeCGTCT 6TGGTGGGAT TTTGCCTTTT 1440

GGTAATGCTG TTTCTGCTTA AGTTGGCAAG ACACTCCAAQ TTTQGCATGA AAGGTTTTGT 1500 

TTTGTTTCAT AAGATCCCAC TG6ATGGGTA GCTGAAATAA AGGAAAAGAC AGAGAAAQGG 1560 

GCTGTGGTGC TTGTXGGTTG ATGCTQCCAT GTAAGCTGGA CTCXTTCGGAC TGCEGTTGGC 1620 

TTATCCOGGG AAGTCSCTGCT TATCTGGGGT TTTCTGQTAfi ATGTGGGCGG TGTTTGGAGG 1680 

CT6TACTATA TGAAGCCTGC ATATACTOTG AGCT6TGATT GGGGAACACC AATOCRQAGG 1740

TAACTCTCAG GCAGCTAAGC AGCACCTCAA GAAAACRTGT TAAATTAATG CTTCTCTTCT 1800 

TACAGTAjSTT CAAATACAAA ACTGAAATGA AATCCCATTG GATTGTACTT CrCTTCTOAA 1860

AAGTGTGCTT TTTGAECCTA CTGGACATTT ATTQACTTAA TTtaCTTCTGT TTATTAAAAT 1920 

TGAC3CTGCAA AGTTAAAAAA AAATTAAAOT TGAGAACAGG TATAAGTGCA CACTGAATAS 1980

TCTAATCTAC ATQTAACACA TATTTTA6TA TGATTTTCTA TACTCTAATC AGCACIGAAT 2040

TCAGAGGGTT TGACTTTTTC ATCTATA2U2A C3U3TGACTAA AAGAGTTAA6 GOTATATATA 2100 

CCATCACTTT GGGACITGGT AGTATTATTA AAAGQTTATT TCCTTCACrTG TCAATAAAAG 2160 

TCCAAAIGTT TAGCTTAGGT CTOAGAGTCA AACAATGTTA AGGATTGTCT TAAAQTTCCT 2220 

TAGOCAGCAA AACAAAACAA AACAAAACAA ACAAATOAAA iUlCGTirTAAA AAGAAGAAGA 2280 

AGAAAAAAAA CAAGAACAAG CAGCAAJCAGC ' i X^- m 'IXSTTG GGGCTATAGA TTTAAGTTAG 2340

GCATAGTCAA TTTCAfiAATA ACTAAGAGTG GAATATATQC ATATGGTGAA ATTATAAOCT 2400 

TGCCCrrrTT TATTTGCCCT CTGCSQATCCA CCTQGCTTTT TACAAGTCTO CCSQAGTOAGA 2460 

AQGCCACAGT ATCTCATGCT GTTTGCATTA CAQAACTGCA GCTTTrCTAC TCTGAAAAGG 2520 

OCTGGGAGCA GAATCX^CIGG CCTGCTG1X3A GCAGGAGAGG AGATTCTAAG AAGGATAGTC 2580

CCCCCTACAA CATACTGTCA TACTGCrGGG TTTTC3VIGGG TAGGAAAGCT TGTCCTGACC 2640

CCAGCAGCAA AGAGGTOGCA GGTCGCTAAT GAATATATGC TTTATAATGT CCTTCTlCAT 2700 

TGCTGAGftSG GCAGCCTTAG AGCTGTOGAT TTCTGCATCC CCCCTGAGTC TGACCCATQQ 2760 

ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT CAGTCTGTGT 2820 

CAGGCAGTAT GCTTGTCXTTQ AAQAQAGGIT TGGCTATCCC CX».CCCCACC CACCCACCCX 2880 

GTTCCTTTTT TATCAGGAGG ACTTCAGAGC CAGGCCTGCA GCATTTTGTT TGAAAACACA 2940

ATCAGCTCTG ACAGTTAGAC ATGCACACAG AOGOCATAGC TGOATTOOAA ACRTIGATGT 3000 

TTTAAAAATT TATTTTTTTT GGAAATAGTT GCavCAAATGC TGCAATTTAG CTTTAAGQTT 3060 

CTATAGATTT TTAACTAGTC CAACACAGTC A6AAACATTG TTTTGAATCC TCTGTAAACC 3120 

AAQQCATTAA TCTTAATAAA CCAGGATCCA TTTAiSGTACC ACTTGATATA AAAAGGATAT 3180 

CCATAATGAA TATTTTATAC TGCATCCTXT ACATTAGCCA CTAAATAOiT TATTGCTTGA 3240

TGRAGACCTT TCACAGAATC CTATGGATTG CAGCATTTCA GTTGGCTRCT TCATACCX:AT 3300 

GCCTTAAAGA OGGGCAGTTX CTCAAAAGCA GAAACATGCC GCCAQTTCTC AAGTTTTOCT 3360 

GCXAACTOCA l:l:TGAATGTA AGGGCAGCTD GGCCCCAATG TGOGGAGGTC CGAACATTTT 3420 

CTGAATTOCC ATTTTCTTGT TaSCGGCTAA ATGACAOTTT qTOTCATTAC TTAGATlTCCG 3480

ATCTTTCCCA AAGGTGTTGA TTTACAAaOA QQC!C2RjSCTAA TAtKftGAAAT CATGACCCTG 3540

AAAGAGAGAT GAAATTCAAG CTGrGAGCCA GGCAGGAGCT CAGTATOGCA AAGGTTCr«3 3600 

AOAATCAGCC ATO^rGGTACA AAAAAGA7TT TTAAAGCTTl TATGTTATAC CATGGAGOCA 3660 

TAGAAAGGCT ATGGATTGTT TAAGAACTAT TTTAAAGTGT TCCAGACCCA AAAAGGftAAA 3720 

i^AAAAAAAA AGGAAIATTT GTACCCAACA GCTAGAAjGGA TTGCAAGGTA GATTTTTGTT 3780 

TTAAAATGGA GAGAAGTGGA CAGATAAGGC CATTTAATAT ATCAAAGATC AGTTOACATC 3840

TCTAiGGGAAT GATGAAACAG CAOGCTATTA QAAAATTATT TCATATAGIT CrOGTGTTCT 3900 

TTTCTTTTTT TTAATCXXTTG AAGGQAGATC AGTAACATAG CTTCTeTTTT CTGTACTCTA 3960 

GACCUCCCCI mCMPCATr TTGCTITTTA TQTCTOCCAT AAGAAATGTG CTTTTTAGAG 4020 

CTTCCTAATG CATGTQTTGC AUTATiGCAG CATTAGAAAA GG&GfiaSTAiS CATTTTTGCT 4080 

50 <3AAAT0GGGC CTGTCRCTCT CCAATAAAG3 TTClGGCSiCr TCRATGOCftG GCAGGTCTCC 4140 

TAAATGAACA GAATGATCXG TGTGAGCCGA TGCCTGCCXZT TtTCAQAGGGG CCAClGTCGC 4200

CAGGOGCAGC CAACTGTGTC CCACAOOAAT GGGAGCCTAG GTTTCCAAAT CTTGTGATTC 4260 

TTTAGQAGAA ACAZTGAAAOC TGGKTTTOGT GTGAAA3GTC CXJGATTGTTA AAAAGTTGGC 4320 

TCAAXTATTT TTAAAACATT TTGTAAIcaCCA ACAAAAGTCT GIGQGCSGCC A3ITTATTAC 4380 

TTTTGTCITA AAACATGATC XrTGTTCTCT CACBOTATCC TTCX6TCTTC CGGTTGCAAA 4440

TTCACTTTTC TTTCTTCCTQ ACATTGOCAT TGRGQGCTTT GTTAGC3U2AA GCTAAGAAAC 4500 

TOMSTXTAAjC ASCCCAGTTA TCTGCAACAT QTCAATTACC Ti' l 'GCrQCTC TOCTGT6ATT 4560 

OOCACCATGC TGTGACC3CrC AGCTGICTGC dTTGCTGGG AATTCTC3CAC CAATGTCTXC 4620 

COCrCI^AOCC ATTOCXTEGGT TGOTCCIACT CCO&S&IGGC C&jQAGACKrC CTAGCAAATC 4680

CTTGCTOCTA TTATATCTGA CACIAATTTC TTTTCAACAO CGCTCSVDGTC TCIIGGCCCA 4740

GTCAGGTGCT GCCAGGTTTA GATAGGAAAG 7ACATGTOCC ATTTTaU?GG GTOCCCTTAA 4800

TQIQGTCCAC GTCCTATATC TTATTATATT TACTCATGGC "ICAATGGGGG CCTCCAGAGA 4860 

OOCTCTCAGG CTGCTGAiGCr AGAlCTAASGA AIGCATCCAC TGTCATCACA TGAGAiCACEe 4920 

ACrCTCSlSAC 6ACAAAAGTA CAAACACXCCr GAGQCTrAAGA AAjSOTTCATC TCftCAACAGG 4980

AAAAACAAAT CTCAACACAC ATTAGAGAXA ATTGATTCAG GGGTTZTCTG TOCCA6TCTC 5040

CCaGCAGGGA CTGATTTCAT TTCTGACGOV CTAGGTTTTC TTTOCAGAAA TAGGTAGCAA 5100 

GOAiCaUkQAAC TAAACAATCC CAGCCOCACX: CAGCAACACA GAACACAGGA GrTTGCTTTT 5160 

GGCITCTCAC TCTGCAAGTA ACSCCTGAATT AGGGOCAGAA TGGCTGAGGC TTOGAGCATC 5220 

TCCTCZU3ACA GAGCftGAGGC GACACCXCTT CAGGGQTOI6 TGGAOIAAlVr AGCTCGAAGA 5280

GCK3AAI3ACA GAAAACCAOT TTCAOSCSCSG GTGCGAGAGA GAGCATAATG G»GOGAASOC 5340

CGCTTTCTCr CTCCTCITCT TTTCTCTTTA TTTCTTrAGA GCACTTQACT TTTTTTTCTC 5400 

TCTCTCTCTA GTATTCTAAA CTGACCTCAT QAOCAACrGA GAATTTATTT TTGTTTCATT 5460 

GGTTOTTrCA CAOAATTAGA ACACACA06A CTTTTTATTC CTCCATTGCA AAATGGAATC 5520 

AAl^TACxmC ACAAGACCtCG TGCTTTCITC CTriGCAIGA n*XACAOCTC CGGCTGTTTT 5580

/5 GOTGCTAGCT <3TCXnGAACT TCTCTCTTGG TITGAATCTG ATTCCTTCAC ACTACACTAG 5640 

AAGTTTAXrrr CATCTTGXTT TGXCTAGACT CCAGATACAG AGGGACAGCT GGACTGAGGA 5700 

<:AAGCAATTC CATCTAGCAT AIOGGTCTCTC AGGGTTGQTG CATCCAGCCA CATGGGCAGG 5760 

GCCAGTCACA TCTAGTCTAT GTG0CCAGA6 OXTTGGAGT IXSOGCASCTT AGCTGACTTQ 5820 

ACTCCIUIGGA AATTM3TACA GAAGTAACCA CXCZATTAAQ TOianClGC TATGCTCACA 5880 

TGCCrOTTkGT ACCIGCAAAC CATGOCAGGT TCATCTAAAG ACATAGGGGA AOATTAAGGA 5940

CTCTTTT06A CAGACQ^GA ATTOAATTTG CTGCCAGGTG CTGCX:&GACT GAATTTGGCT 6000 

QACBiGAaCTC OC^GCOCAGG AAAGTTCCAT GACAATGACT GTCGCAGAAG GAAATTTCCC 6060 

ACTAAAGTCA GTCXJ^TTTTC AAGTTTTQGT CTTCAGAGAC AAAAGAACS^T GOCAGCCACC 6120 

TGATTTXaAT GGTGAGGTAA CTCTAAGTEG AATTCAiSGCT Aj ^Wll GCAG TATAGCTTT6 6180
GCftTGTTCftT GAOTGAGCAC CCAGAATGTG TTGAACCftAC CCCCAOCCCT AACTACTGAC 6240

TAT6ACTGCA GTGGGTTTTT ATGC3QSAAAA. AAAGTGT6AA AAGCAAAAA6 AAAGQAACftG 6300 

AGATTTTTTA TCACCTTrAT TGTAAORCAG TCCATTTATG AATTGAGTAT AAACACATAC 6360

AAAGTAACiU;. QAGATTCXITA AC5AAACGCAA ATCCTTOAGT TTCACGCACT TCATGTTCAA 6420 

CCAT TTGCTG TAATCCAGAG GCAGCCTGTa AATCATTCTC ATGCCCTGTT TTH-TTTTTT 6480 

TTTTCCTATA ATGTTCTGGG TTTAAAAGCC ATCTTTTC3CA CATTTTCTGT AAATAATGGA 6540 

TAATCATTTT AAAAAGTTTT ATTTTTAGTG CTGTTTTAAC AAXGTAGATA GATCATAAAT 6600 

OTACTTGCTG AATTCAATCA TTTTTAACAA GCCAATAMG ^TTlXaATAATT CUU^AAAAAA 6660 

AAAAAAAAAA AAAAAAAAA 6679

Seq ID NO: C109 DNA Sequence

Nucleic Acid Accession #: NM_006180.1

Coding sequence: 352..2820

1 11 21 31 41 51

| | | | | |

CCCCCATTCG CATCTAACAA GGAATCTGCG CCCCAC3AGAG TCCCGGAOGC CGCGGGTOGG 60 

TGCCCGGCQC GCCGGGOCAT 6CAQOGACGG COGCCGOGGA GCTCCOAGCA GCSGTAGCQC 120 

CCCCCTGTAA AGOGGTTCQC TATGCCGQGA CCACTGTGA^ CXXTTGCCGOC TSCCX3GAACA 180 

CTCTTCGCTC CGGACCAGCT C34GCX:rCTGA TAAGCTGGAC TCGGC3VCG0C OSCaUiCAAGC 240

ACCGAGGAGT TAA0AGAGCC GOU^CGGIkG GGAAGQCCTC COCGGAOSGO TGGGGGAAAG 300 

CGGCCGGTGC AOCGOGGGGA CAQGCACTCQ GGCTG6CACI GGCTGCTAGG GATGTCGTCC 360 

TOOATAAGGT GGCATGGRCC CGCCATGGCG CGGCTCTGOG GCrTCTGCTG GCTGGTTGTG 420 

GGCTTCTGGA GGGCCGCTTT CGCCTGTCCC ACC3TCCTGCA AATGCAQK3C CTCTOSOATC 480

TGGTQCAGC3G ACCCTTCTCC TOaCATCGTG GCATTTCCGA GATTOGAGOC TAACAGTCTA 540 

QATGCK3AGA ACATCAC06A AATTTTCATC GCAAAfX3U?A AAAGGITASA AATCATCAAC 600 

6AMSATGATQ TTGAAGCTTR TGTGQOACTG AGAAATCT6A CAATTGTGGA TTCTGGATTA 660 

AAATTTGTGG CTCATAAAlQC: ATTTCTGAAA AACAGCftACX! TGCRSCACAT CAATTTTAOC 720 

CGAAACSiAAC TGAOGAGTTT GTCTAGOAAA CATTTCXiGTC ACCTTQRCTT GTCTCAACTG 780 

ATCacrGGTGG GCAATCCATT TACATGCTCC T6IGACATTA TGTGGATCJU^ (zACTCTCCAA 840

GAGGCTAAAT CCAQTCCAGA CACTCftfiGAT TTQTACTGOC TGAATGAAAO CA0CAAGAAT 900 

ArrCCCCTGG CAAACCTGCA GATACCXAAT TGTGGTTTtSC CATCTGCAAA ItrFOGCCGCA 960 

OCTAACCTCA CTQTGGAGGA AGGAAAGTCT ATCACATTAr CCTGTAGTGT GGCAGG^EXSAT 1020 

CCGOTTCCTA ATATGTATTQ GGATOTTGGT AAOCTGGrTT CCAAACATAT GAATGAAACA 1080

AGOCACACM: AGGGCTCCTT AAOGATAACT AACATTTCAT COGATGAGM? TGGGAAGCAS 1140 

A TCTCTT GTG TGGC3GGAAAA TCTTGTAGGA GAAGATCAAB ATTCICTCAA OCICACIGTS 1200 

CATTTTGfCAC CAACTATCAC ATXTCTC6AA TCICCAACXTT CAGACCAOCA CTGSTGCSVIT 1260

CCATTCACTG TGAAAGGCAA C(X!CAAACX:a GOSCTTCAGT QOTTCTATAA GGGGGCAATA 1320 

TTGAAaXsAGT CX2AAATACAT CTGTACTAAA ATACATGTTA CCAATCaC3iC GGAGTACCAC 1380 

GGCTGCCTCC AGCTGGATAA TCCCACTCAC ATGAACAAIG GGQACTACAC TCTAATAGCC 1440

AAGAATGAGT ATGGGAA6GA TGAGAAACAG ATTTCTGCTC ACTTQlTGaa CTGGCCtGOA 1500 

ATTGAOGATG GTGCAAACCC AAATTATCCT GATGTAAlTT ATGAAQATTA TGGAACTGCA 1560

GCGAATGACA TCQQGGACAC CAOGAACAGA AGTAATGAAA TCCCTTCCAC AGAOSTCACT 1620

GATAAAACCG GTOGGGAACA TCTCTOGGTC TATGCTGTQa TOOTGATTGC GTCraTCGTG 1680 

GGATTTT6CC TTTTGGTAAT GCrGTTTCTS CTTAAGTTGG CAAlGAGACIC CAAGXTIGGC 1740 

ATCSAAAQGCC CftGCCTCCQT TATCAGCAAT G&TGATGACT C30CCAGCCC ACT0C3VTCAC 1800

ATCT0CAAT6 OGAGTAACAC TCCATCTTCT TOGGAAOGTG GCCCAQATQC TGTCAXTATT 1860

GGAATOAOCA AGATCCCTQT CSVTTGAAAAT CCXICAGIHCT TTGGCATCAC CAACAOTCAG 1920 

CTCAAGCCAG ACACATTTGT TCAGCACATC AAjGCJQACATA ACATTGTTCT GAAAAGGGRQ 1980 

CTAGGGOAAG GA6CCTTTQG AAAAOTGTTC CTAGCTGAAT GCTATAAOCT CXGUCCTGAG 2040

CAQGACAAGA TCTTGGTGGC AGTGAAGACG CIGAAOGATG OCAGrGACAA TGCACGCAAfi 2100 

GACIXCCACC GTGA0GCOSA GCTCCTOACC AACCTCCASC ATGAGCACAT C3OTC!AAOTTC 2160 

TATGGCGTCT GOGTGSAGGG CGACCCCCTC ATCATGGTCT TTGAGTACAT OAAGCATGGG 2220

GACCrCAACA AQTTCCTCAG GGCACaCQQC CCTGATGCCG TGCT6AXGC3C TGRGGGCAAC 2280 

COGCOCACGG AACXGACGCA GTOGCSVGATG CTGCATATAG OCCAGCAGAT OflCCGOSGGC 2340

ATGGTCTAGC TGGGQTCCCA GCAGITOGTG CSICCGOGATT Tt3GCCAOCAG GAAdGOCXO 2400 

6TCGGGGAGA ACTTGCTGGT QAAAATCGGG GACTTTGGQA TGTCCOGGGA caTGTACAGC 2460 

ACTGACTACT ACAGGGTOGG TGGGCACACA ATGCTGCCX^A TPCGCTOGAT GCCT0C3VGAa 2520 

AGCATCATGT ACAGGAAATT CACGACGGAA AGCJGACGTCT GQAGCCTGGG GGTCQTGrTG 2580

TQGGAGATTT TCACCTATGG CAAACAGOCX! XGGTACCAGC TGTCAAACAA TGAGGTGATA 2640

CAGTOTATCA CTCAGGGC3C3S ASTGCTGCAG CaACCGCGCA GGTGCCCXX»i GGRBCSLOSAT 2700 

GAGCTGATGC TGGGGIGCTO 6CAGCGAGA6 OOCCACKIQR GGAAGAACAIT CAA0GGCRTC 2760 

CATA CCCI CC TTCAGAACTT OGCG71AGGCA TCTCOGOTCT AOCl^GGACAT TCTAGGCTAO 2820 

^ GGCX: CTTTT C CCCAGACOGA TCJCTTCCCAA CGTACTOCTC AQACGGGCTG AGAGQATGAA 2880 

CATCTTTTAA CXGCCGCTGG AG6CCAGCAA GCIGCTCTCC TTCACTCXGA CAGXATTAAC 2940

ATCAAAGACr CCGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCITCATC CASAGiAC3U» 3000 

QTA TTG A qCT CTtTTTGGCSk TTATCTCTTT CrCTCrTTCC ATCICCCTTG OTTGTTCCTT 3060 

TPTCTTTTTr TAAATTTTCT TlTTCTTCTT TTTTTTCGTC TTCXXTGCTT CAOGATTCTT 3120 

TOACCCTTTCXT TTGAATCAAT CTGGCTTCTG CATTACTATT AACTCIGCAT AGACAAAGGC 3180 

CTTAACAAAC GTAATTTGTT ATAXCAGCAG ACACTCCAGT TTGOCCACCA CAACTAACAA 3240 

TGCCITGTTG TATTCCTGCC TTTGATGTOG ATGAAAAAAA G66AAAAOIA AXATIVCACT 3300 

TAAACITTQT CACTTCTGCT GTACAGATAT CGAGAGTTTC SWIGGATTCA CTTCTATTTA 3360 

TT TATTA TTA TTACTSTTCT TATTGTTTTT GGATGGCITA AGCCTGTGTA TAAAl^UUSAA 3420 

AACTTGTGTT CRATCTOTGA AGCCTTTATC TATGGGAGAT TAATkACCAGA GAQAAAGAAG 3480

ATTTATTATG AAOOGCAATA TGGQAflaRAC AAAGACAACC ACTGGGATCA GCTGGIGTCA 3540

GTCCCTACTT AGGAAATACT CftGCAACTGT TAGCTOOeAA GSAAaQTArTTC GSCSlCCrTCC 3600 

CCTGAGGACC TTTCTGAGGA GTAAAAAQAC TACtGGCCTC TOTGOCWIGQ ATGATTCTTT 3660 

TCOCATCACC AGAAATGAIA GCGTGCAGTA OACASaCAAAG ATOGCTT 3707



AASAOGGUTT CTCAGACAAG GCTTGCAAAT GCCCX3GCAGC CATCATTTAA CTGC^COOGC 60

AC3AATAGTTA COGTTTGTCA. CCCQJLCCCTC COGGA.TCGCC TAATTTGTCC CrAQTOAOAC 120 

CCCGAGQCTC TQCCOSCGCC TGGCTTCTTC GTAtaCTGGAT GCATATCGTG CTOCGGGCAS 180

CC3GGGG0GCA. GGGCACGCGT TCGCGO^CAC OCTAGCACTiC ATGAACACSC! SCAAQ&QCra 240 

AACCAJU3CAC GQTTTCCATT TCAAAAAGOG AGACMSCCTC TACC3B0GATT GTAGAM3AGA 300 

CTGTGGTGT6 AATTAGGGAC CGQOAGGCQT OGAAOQGAOCS AAC9GGTTCAT CTTAGAGACT 360 

AATTTTCTGG AGTTTCTGCC CXTTGCrCTGC GTCAGCCCTC ACGTCACTTC GCCAGC3\GXA 420

GCAGAGGCXX5 0G6CGGCQGC TCOT3GAATT GOGTTGGAGC AGGAOCCTOS CTOGCTGCTT 480 

CQCrCQCGCT CTACGGGCrC AGTCCCOGGC GGTAGCAGGA GOCltSGACOC AGGCGQCQCC 540 

GGGGOGOSTG AGGCGCGOGA QCCCOSCICTC GAGQTGCATA CC39GACCCC3C ATTGGCATCT 600 

AACftAGGAAT CTGC6CCCCA eAGAGTCCOS GSAGGGOCGC C3GGXCGGIGC CCGQCGCGCC: &€0 

OGGCCATGCA GCGRCGGCCG CCGCGGAGrr COQAaC!AGCG QTAGCGCCCC OCIGTAAAGC 720 

GOTTCGCTAT GCCX3GGGCCA CTGTGAACCC TGCCGCXTTGC CQGftAGACTC TTCGCTCCOG 790 

ACCAGCrCAG CCTCTBATAA GCTOQACTOB GCAOSCCCaC AACAAGCACX: GAGGASTTAA 840 

GASAGGOGCft AGOOCAGGGA AGGCCTCCCC GCAOSGGTGG GGGAAAGC06 CCXSGTGCAaC 900 

GCGQGGACAG G<ACTC6GGC TGGCACTGGC! TOdKOGQAT GTCGTCCTGG ATAAGGTGGC 960 

ATGOACCOGC CATGGOGCGG CTCTGGGGCT TCTGCTGGCT GGTTGTQGGC TrCTGOaOGa 1020 

CSOGCTTTCGC CT6TCCCRCG TCCTGCAAAT QCAQTQCCTC TaSQATCTGG TGCAGOGRCC 1080

CTTCTCX:T0G CATOGTGGCA TTTCOGAGAT TGGAGOCTAA CAGTGTAGAT CCXGAGAACA 1140 

TCAOCX3AAAT TTTCATCSCA AROCASAAAA GOTTAQAAAT CATCAAGOAA GATGATGTTG 1200

AAgCTTATQT GGGACTGAGA AATCT6ACAA TTGTGGATTC TGGATTAAAA TTTGTGGCTC 1260 

ATAAAGCATT TCH^MiMiAC AGCAACCTGC AGCACATCAA TTTTACXTOGA AACAAAGTGA 1320 

CGAGTTTGTC TAGGAAACAT TTCCGTCAOC TT6ACTTGTC TGAACTGATC CTGGTGGOCR 1380 

ATCCATTTAC ATGCXCCIGT GACATTATGT GGATCAAGAC TCTCCAAGAO GCTAAATCCA 1440

GTCCAGACAC 7CM3GATTTQ TACTGCCTOA ATGAAAGCAO C3JU3AATA1T OCOCXGG<3kA 1500

ACCXCCSGAT AOXaVATTGT GGTTTGCCAT CTGCAAATCT GGOOGCACCT AACCTCACTG 1560

TGGAGGAAGG AAAfiTCTATC ACA^TPATCCT QTAGTQTQGC AtSGTOATCXX? OTTOCTAATA 1620 

TGraTTaaOA TGTTGGTAAC CXGGTTrCCA AACATATGAA TGAAACAAGC CACACACaOG 1680 

GCTCCTTAAG GATAACTAAC ATTO^CATGCQ AIGACASTGG GAAGCAOATC TdTGTGTGG 1740 

OOGAAAATCT TGIAGGAGAA QATCAAQATT CTGTCAACCT CftCTGTGCAT TTTGCACCAA 1800 

CTATCACATT TCTCGAAIC7 CCAACCTCAG ACCS^GCftCTG GTGCATTCCaV TTCACTGTGA 1860 

AAGOC31AOT7 CAAACCAGOG CTTCAGTGGT TCTATAACGG GGCAATATTG AATGRGTCCA 1920 

AATACATCTG TRCTAAAATA CATGTTACCA ATCACACGGA OTACCACGGC TGCCTOCAGC 1980 

TGGATAATCC CACTCACA.TQ AACAATGGGG ACTACACTCT AATAGCXSU^G AATGASTATQ 2040 

GGA»GGaaQA GAAACAI^TT TCIGCCCftCT TCATGGQCT6 GCCieGAATT QAOSATGGTG 2100 

CAAAlGCaUUk TTATCCTOAT QXAATTTATG AAGATTATGG AACTGCAGOS AATGACATC3G 2160 

GOGACACCAC GAACAGAAGT AATGAAATCC CirTOCACAGA CGICACTGAT AAAAO0G6TC 2220 

GQGflACRTCT CTQGGTCTAT GCTOTGGTGG TGATTGOGTC TGTGGTGGGA TTTTGCCTTT 2260 

TGG7AATGCT 6TTTCIGCTT AAGTTGGCAA GACACTCCAA GTTTGGGATG AAAGGTTTTG 2340 

TXTUGXTTCA TAAOATCCCA CVSGATGGGr AGCTGAAATA AAGGAAAAGA CAGAGAAAGB 2400 

GGCTGTOGT6 CTT G TTGGTT GATGCIGCCA XGEAAGCIGG ACTGCTGGOA CTGCTSTTCG 2460 

CTTATCXfC3GO GAAGTGCraC TTATCTGQQG TTTTCTGGTA GATGTGGGCG GTXSrl-rGGAS 2520 

C?CTGTACTAT ATGA7U30CTG CATA3ACTGT GASCIGlGAT TGGGGAACAC CAATGCASAG 2580 

GTAACZTCTCA GGCAGCTAAG CAGCACCTCA AGAAAACATG TIAAATTAAT GCTTCTCTTC 2640 

TTACAGTAGT TGAAJOACAA AACT6A2iAXG AAATOCCATT GaAXTOTiACr TCTCTTCIGA 2700 

AAAGTGTGCT TTTTGACOCT ACTGGACATT TATTGACTTA ATIGCTTCT6 TTTATTAAAA 2760 

TTGACCTGCA AAGTTAAAAA AAAAfTTAAAG ^Il^GAGAACnfi GTATAAGrTGC ACACTGAATA 2820 

GTCTAATCTA CATGTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CAGCACTGAA 2B80 

TTCAQAGGGT TTGACTTTTT CATCTATAAC AC&STGACTA AAAGABTTAA GGGTATATAT 2940 

AOCATObCEI TOSGACIIOa TAOIATTKTT AAAa0GTTAT TTCCITCACT 6ICAATAAAA 3O0O 

GTCCAAATGT TTAGCTTAGG TCIGAGAGTC AAACAAT6TT AAGC^^TTGTC TXAAAGTTOC 3060 

TTAGOCAGCA AAACAAAAiCA AAACAAAACA lUkCAAATGAA AAACGTT7AA AAAGAAGAAG 3120 

AAGAAAAAAA ACAAGAACAA GCAGCAACAG CTGTTTTGTT QGGGCTATAG ATTTAAGTTA 3180 

GGCATAGTCA ATITCAGAAT AACIAAGAGT GGAAO^ATATG CAXKTQGTGA AATIATAACC 3240 

VSGCCCt^Vr TTATTTGtXSC TCTGOGATCC ACCTOCTTTT TAGAAGTCI6 CCGAGaentaA 3300 

AOGOCACAGT ATCTCaUEGCr GTTTGCATTA CAGAAClGCA GCTTTTCTAC TCTGAAAA0S 3360 

CCTGGGAGCA GAATGGCrGG CCTGCTGT(SV GGAGQAGAG6 AGATTCIAAG AA6GATAQTC 3420 

CCCC3CTACAA CATACTGTCA TACXGCTGGG TTITCATGGG TASGAAAGCT TGTCCTGACC 3480 

CCAGCASC7A AGAGGTOQCA GOTCSCrrAAT OAAT&TATGC TTTATAATGT GCTTCTTCAT 3540 

TGCTGR GauaO qCftgO CITOS AGCTGZGGAT TTCTGCATOC CCCCIGAGTC TQACQCSITGG 3600 

ACACCTGTTT CAXTCACTTT AGCATCACStfa TGAOCTTTGT ATGCTCTGTT CAGTCrUTGT 3660 

CAGGCAGTRT GCTTGTOCTG AAGAGAGGTT TGGCTATCCJC CACCOCACOC CAOCCCACCX: 3720 

TOTTCSCTTTT TTATCAGGAG GACTTC3VGAG CC3M3GCCTGC AfiCJtrrTTGT TTGAAAACAC 3780 

AATCAGCICT GACAGTTAGA CATGCACACA QACGCQWTAG CTGGATTGGA AACAII^Tt^ 3840 

TTZTAAAAAT TTATTTTTTT TGGAAATA6T TGCACAAAT6 CEGCSUVrXTA GCTTTAAGGT 3900 

TGTATA6ATT mAACIAST CCAACACftGX CaO^AAOKCT OTTTXCaUkTC GTCTGTAAAC 3960 

CAAGGCATTA ATCTTAATAA ACCAGGATCC ATTTAGGTAC CACTlCaVTAT AAAAAGGATA 4020 

TCCATAATGA ATATTTTATA CTGCATCCTT TACATTRGCC ACTAAATACG TTATTGCTTG 4080 

ATGAAGACCr TTCACAGAATT CCTATGGATT GCAGGATTTC ACITGGCZAC TTCATACOC& 4140 

l:GCC:TTAAAe AGGGGCAGTT TCTC^AAAAGC AGAAACATGC OGCCAGTTCT CAAGITTTOC 4200 

TOCTAACTCC ATTTGAATGT AASGGCAGCT GGOCCGCAAfT GlGGGGAaGT OOGAACATTT 4260 

TCTGAATTCC CATTTTCTTQ T^OGOGGCTA AATGACAGTT TCTGTCATTA CTCAGATTCC 4320 

GATCTTTCCC AAAGGTGTTG ATTTACAAAG AGGOCfiGCTA ATAGCAGAAA TCRTGAOCCT 4380 

GAAAGAGAGA TGAAATTCAA GCXGTOAGCC AGGCAGGA6C TCAGTATGGC AAAGGTTCTT 4440 

GAOAATCAGC CATITGGTAC AAAAAAGAXT TTTAAAGCIT tmtGIIAXA CCATGGAGCC 4500 

ATAGAAAGGC TATGGATTGT TTAAGAACTA TTTTAAAGTG TTCC3UaA£X:C AAAAAGSAAA 4560 

AATAAAAAAA AAGGAATATT TGTAOCCAAC AGCTAGAAGG ATTGCAAGGT AGA<ITTTTGT 4620 

!CTTAAAA*IGG AGAGAAOTGG ACACATAAGG OCATTTAATA TATCAA2&QAT OhGTTGACAT 4680 

CTCCrAOQGA ATOATGAAAA CAGCAGGCTA T 4711 
5ATGCCACTGA AGCATTATCT CCTTTTOCXG GTGGGCTGCC AAGCCTGGGG TGCAGOSTTG 60

GCCTACCATe GCTGCGCTAG OG3W3TGTACC 'TaCTCX::AGG6 CCTCCCAGGT GGAGTQCACC 120 

GGGGCMMCA TTGTGGCGGT GCCJCACOCCT CTGCCCTGGA ACGCCATQftG CCTGCAGATC l&O 

CTCaACACGC ACATCACXQA ACTCAAT6AG TCCCCCJTTCC TCAATATCTC JiflCCCTCATC 240 

GCOCTGAGGA TTGAGAAGAh. TOAGCTGTCG CGCATCACGC CTOOGGCCTT CC6AAACCTG 300 

GGCTOSCTGC GCTATCrCAlB CCTOSCCftAC AACAAGCTGC AGGTTCTGCC I2ATCGCCCTC 36Q 

lU TTCCftOtSGCC TGGACAGCCT TBftGTCTCTC CrTCTGTCCA GEAACCAGCT GTTGCROATC 420 

CAGCCGGCCC ACTTCTCKCA GTGCAQCAAC CTCAAtSGAGC TGC3W?rTC3CA CGGCAACCAC 4aD 

CTGC3RATACA TCCCTGACGG AGCCTTCGAC CSiCCTGGTAQ OACTCACGAA GCTCAATCTG 540 

OTCftAGAATA QCCTCACCCA CATCTCACCXZ AGGGTCTTCC AGCACCTGGQ CAATCTCCAG COO 

^ ^ CTCCrCCGGC TGTAl'GAiGAA CAGGCTCAOG GAIATCCCCA TOGGCACTTT TGATGQQCTT 660 

LD GTTAAOCTPGC AfSGAACTGGC TCTACAfiCAfi AACCAGATTG GACTGCTCTC CXCXGGTCTC 720 

TTCX:aCMCA ACCACTJVCCT CCAQAOACTC TACCrGTCCA ACAACCACAT CTOCCaOCTO 7&0 

CCACOCAGCA TCTTCATGCA GCTGOCCCAG CTCAACCGTC TTACTCTCTT TQC3GAATTCC £40 

CTGAAGOMC TCTCTCTGGG GATCTTCJSGG CCCATGCCCA ACXITGOGGGA GCTTTGaCTC 900 

TA TGftCR nCC ACKTCTCrrC TCTAGCOGAC AATQTCTTCA GCAACCTCCe CCAGTTGCAG 560 

ZU GTCCTOATTC TTAGCGGCAA TCAQATCAGC TTCATCTCOC CGeCTGCCTT CAACGGGCTA 1020 

ACXSGAGCTXC CSGCSAaCTiOTC CX;T0C:ACACC AAOQCACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATQT TGGCCAACCT GCAQAACATC TCCCTGCAGA ACAATDGCCT CAGACAGCTC 1140 

CCAGGGAATA ICTTCaCCAA CGTCAATGGC dCATGOCCA TOCftGCTGCA GAACAACCAG 1200 

CTGGAGAftCT TQCCCCTGGG CATCITOGAT CftCCTGGGGA AACiaTGTGA GCTGOGaCTG 1260 

ZD TATGACAATC CCIGGAGOTQ TOACTCAGAC ATCCTTCC3GC TCOGCAACIG GCTCCTGCTC 1320 

AAOCAGCCTA GGTTAGGGAC GGAiCACICTA CCTGTGTGIT TCAGGCC3U3C CAATGXCC3GR 1380 

GGCXZAGTCOC TCATTATCRT CAATGTCAAC GTTGCXGTTC caAGCGTCCA rGTCCCTGAG 1440 

GTGCCTAGTT ACCQUaAAAC ACX:ATGGTAC CX3U3ACACAC CCAGTTACCC •naACACCftCA 1500 

^ _ TCOerCrCTT CZACCACTGA GCTAACCAGC OCTGTGGAAG ACTACACTGA TCTGACTACC 1560 

jV ATTCAGGTCA CTGATG2U2Ga CAQCGTTTGG GGCATGACCC AGGCCCAGAG OGOGCTGGCC 1620 

ATTGCOGCCA TTSTAATTOG CATTGTOQCC CTGGCCTGCI CCCIGGCTGC CTGGGtOOGC 1680 

TGTTGCrrGCr GCRAGAZUaAG QAGCCAAQCT 6XOCTGATGC AGATGAAGGC ACCGAATGaS 1740 

TGXrAA 1746 

Seq ID NO: C112 DNA Sequence

Nucleic Acid Accession #: NM_002658.1
Coding sequence: 77..1372

1 11 21 31 41 51

GTCCOCJGCAO CGCOGTCGCG OCCTCCTQCC QCAGGOCACX: GAfiQOC3QCCG COGTCTAQCQ 60 

CCcajACCTC GCCACCATOA OAGCCJCTGCT GOaaCGCCTQ CTTCTCTGCG TCCTQOTOGT 120 

GAGOGACTCC AAA0GCASCA ATGAACTTCA TCAAGTTOCA TCOAACraTa ACTGTCltaAA X80 

j,^ TOQAOaanCA TGTGTOTCCA ACAASTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240 

4 J GAAATTCZGGA GOGGAGCACT GTGAAATAQA TAAGTCAAAA ACCTGCTATG AGGGGaATGG 300 

TCACTTTTAC OGAQGAAflGG CCA6CACIGA C3U3CATGGGC OGGCCCTGCC TGCXXTGGAA 360 

CTCXGCCACr GTCCTTCRGC AAAOOTACXA TGCXX3«aGA TCTOATOCTC TTCAGCTQGG 420 

CCTQGGGAAA CATAAlTftCI GCAGGA&COC AGACAACCGG AGGOGROCCT QQTGCTATGT 480 

GCAGGTGGGC CCAAAGOGGC TSKSTOCAAGA QTGCATGGTG CATGACTGOQ CRGATGGAAA 540 

AftaG OOCT CC TCTCCTCCAB AASAATTAAA ATTTCftOTOr G6CCAAAAGA CTCTOAGGOC 600 

CCGCTTTftftG ATTATTGGOG GAGAATTCAC GACCATOGAG AACCAOC3CCT GGTTTGOGGC 660 

GATCTACROa AGGCAGOGGG GGGQCTCrGT CACCTAC3GT3 TOTGGAGGCA GCCPCATCAG 720 
CCCZTQCIGG GTGATCAQOG OCACACACTG ClICArTOAT TACGCAAAQA AOGAGGACTA 780



CATGGZCXAC CTGGGTOGCI CAAGOCTTAA CICGAACAiC98 CaUkGOgGAGA TGAAGXTTOA 840

GGTGOAAAAC CTCSVXCCCAC ACAAGGACFA CJUSGGCOXSAC AOGCTTGCTC AOCACRAOGA 900

CATTGCCITG C^TGAAGATOC GTrCCAAGSA GOGCAGGTGT GOGCAGCSCAT COOQGRCTAT 960 

ACAGADCATC TGCCTGOCCT CGATGTATAA OGAn-OCCCAG TTTGGCACaA GCTGTGAGAT 1020 

CACTGGCTTr GGAAAAGAGA ATXCOACCSQA CIATCTCTAT OGQGaGCAGC TCAAAAIOAC 1080 

TGTTOTGAAB CTGMTTOCC ACCSGOanOTG TCAGCAGCCC OUCTACTACG QCTCTQAAGT 1140 

OU GACCACCAAA ATGCTATQ3G CTGCTGACCC CCAAOGGAAA ACAGATTOCT GCCAGGGAGA 1200 

CXCAGSGGGA CCCCTCGTCT GTTCCCTCCA ACGCOGCATG ACTTTGACTG GAATTGrOAG 1260

CTOGGGCCGT GGATGXGCCC TC2AAQQACAA GOCftGGCQTC TACACGAGAG TCTCACftCTT 1320 

CITACCCTGG ATCGGCAGTC ACACCAAGGA ASAQAATGGC CEGGOOCTCI GAGGGTGCOC 1380 

AGGGAGGAAA GGGGCACCftC COGCTTTCTT GCIGGTTGTC AITTTTGCAQ lAGAGZCATC 1440 

OD TOCRTCAGCT GTAAGAAQAG ACTGOGAaSA TAGGCTCTQC ACTGATGGRT TTTOCCTCIGG 1500 

CACCACCAGO OTGAAOGACA ATAGCTTTAC CCTCAOGGAT AGGCXTTGOGT GCTGGCnSCC 1560 

CAGAOCJCTCT GGCCAGGRTG GAGGGOTeGT CCTGACTCRA CATGTTACTG ACCAGCAACT 1620 

TGTCTXTTTC TGGACIG3UUS OCTGGAGGAG TTAAAAAGG6 CAC3GGCAXCI CCTGIGCATG 1680 

GGCICBAAGS GAGAlSCCSlGC TCC9C3COQAOC GGXQGGCATT TOTGAOQCCC AIG6TA3AGA 1740 

/U AATGAATAAT TTCCCAATTA GGAASXGTAA GCAGCDGAGG TCITCXTGAGB GAGCITAGGC 1800 

AATGTGGGAS CAGOGGTITG GGGAC3CAGAG ACACTAACGA CTTCAGGGCA GGGCTCTQAT IB 60 

ATTCCATQAA TGTATCAGGA AATATATATG TGTGTGTATa TTTGCACACT TGTTGTGTOG 1920 

GCTGTGAGTG TAAOTGTQAG TAAGAGCTGG XGTCTOATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACXGTGATG CCACZlCftGAG TGGTCTTTCT GSAGAGGTTA TAGGTCSUrrC 2040 

'-5 ClGGGGCXnC TTGaOTOCOC CACQTGACAG TGCCTOOGAA TGTACTTATT CTGCAGCArrG 2100 

ACCarSTGAOC AGCACTGTCT CAGTITCACT TTCACATAGA K5TCCCTTTC TTQGCCAQTT 2160 

AlCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACTGOQ TQ6GGTGAC36 ACCACTCCTT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGIG 2280 

AXCAATAAAA T6TGATTTTT CTGA 2304 



Seq ID NO: C113 DNA Sequence 

Nucleic Acid Accession #: XM_087254.1

Coding sequence: 47..2332
1 11 21 31 41 51

AGAQTACSSXG TTTACAGATA AAACTGGTAC ACTGACAGAA AATGAGATCC AQTTTCGGGA 60 

AT6TTCAATT AATGGCATGA AATACCAAGA AATTAATGOT AGACTTGTAC CCQAAGGACC 120 

iSS?'^^^^ TCTTCAfiAAG GAAACTTATC TTATCTTAGT AGTTTATCCC ATCTTAACAA 180 

nai^I^f^'^ CTTACAACCR GTTCCronT CAGaACCR<3T CCTGAAAATG AAACTGAACT 240 

AATTAAAGAA CATGATCTCT TCTTTAAAfiC AGTCAGTCTC TGTCACACIG TACAfiATTAG 300 

CAATGTTCAA ACTGACTGCA CXeSTGATGG TCCCTGGCAA TCCAACCTGG CACC^TOGCA 3 60 

m SI^^^^ TATGCATCTT CADCAGAISA AARGQCXCTA GTAQAAGCM CTGCAAQGAT 420 

TOGTATTGTG TTTATTGGCA ATTCTOAAGA AACTATGGAG QTTAAAACTC TTOGAAAACT 480 

AAACTGCTTC ATATTCTCGA ATTTGATTCA GATCGTAQGA GAATGA3TGT 540 

AATTGTTCAG GCACCTTCAG GTGAGAAGTT ATTATTTGCT AAAGC5AGCTG AGTCATCAAT 600 

TCTCCCTAAA TGTAl-AflGTG SAGAAATAGA AAAAACCAGA ATTCATGlAa ATGRATTTCC 660 

1 5 SlSJi^^ TOTSTATAGC ATATAGAAAA TTTACRTCftA AflOACSTATGA 720 

i:? GGAAATAGAT AAAOGCATAT TTQAAGCCftC GACTGCCrrG GAQCAGCGGQ AAGflSAAATT 7BO 

GGCAGCTGTT TTCCAQTTCA TAGAGAAA<3A CCTQATATTA CTTOGAGCCA CAQCAGTAfiA B40 

^S^i^™ CAAC5ATAAAG 1TCQAGAAAC TATTSftAGCA TTGaOAATGG CTGGTATCAA 900 

^t^^ CTTACTGGC3Q ATAAACATGA AACAGCTGTT AQTGTGAGTT TA1-CATGT03 960 

in 5^"^^'^ WSAACCATSA ACRTGCTTGA ACXTATAAAC CAGAAATCftG ACAGCGAQTC 1020 

n^r^?^ TTG«3GCAeC TTGCCftQAAG AATTACRGAG QATCATQTGA TECAGC^ lOBQ 

^TGGCIAOCA GCCTATCTCt TGGACTCAGG GAGCATCAAA AACTATTTAT XWO 

OaMGTTTGC AGAAATT6TT CS^GCTGTATT ATGCTOTCGT ATOGCTCCAC TGOVGAAAGC 1200 

AAAAOTAATA AOACTAATAA AAATATCACC TOAGAAACCT ATAACATTOG CTCrXGQTGA 1260 

25 I^^t^ IJGATftCAAflA AGCOCATCTT GQCATOOSAA TCAiqCGTAA 1320 

CTATGCAATA GCCRfiATITA AQTTCCTCTC 13B0 

SiJS^^SSr^ TTTGTTCATG QTCATTTTTA TTATATTftGA ATAGCTACCC TK^TACAGTA 1440 

TTTrrTTTAT AAQAATGTGT GCTTTATCAC ACCCCAGTrT TTATATCAGT TCTACTGlTr ISOO 

GTrTTCTCAQ CAAACATTGT ATGACAGCGT OTACCTGACr TTATACAATA TTTGllTTAC 1560 

TTCOCTACCT ATTCTGATAT AiaGTCTTTT GOUICROCAT GTAGACCCTC A'PGrOTTACA 1620 
titlf^^ ACCCTITATC GRGACATTAG TAAAAACX3GC CTCTTAAGTA TTAAAaSS 

rrCTTTATTGG ACCATCCTGG QCTTCASCCA TGCCITTATT •rTCTTTTTTO GATCCTATTT 1740 

^^J^^ CTCTGCTTOG AflATGGCCAa ATGTTTGGAA ACTGGACATT 18QD 

<5TCTTCACM TCATGGTTAT TAChGTCACA GTAAAGATGG CrCTGGAAAC I860 

35 ISJ^^^!^ fiiS^^^ AOCATCTOGT aaCCTOGGOA TCTATEATAT TTTATTTTCJT 1920 

J3 ATTTTCCTTG TTTTATGGAG OGATTCTCTO GOCMTTTTO GOCTCKCftGA ATATQTAm IdflO 

^l^^l^ CAAGTGGTTC ^IGCTTCGITT GCCATAATCC TCATGffrrGT 2040 

XACAT6TCTA TTICTTGATA TCATAAAGRA GGTCrmSAC OOACACCTCC ACCCTACAAG 31 Oft 
TA^AAG GCACAGCTTA CTGAAACAAA TGCAGGTATC SSSS^ aS^SJS^ 

40 S^I^^^ GRAGSW3AAG CftGC»T©a?ac ATCT6TXOGA AfiAATQCTGG AACGAGTmi 222^ 

AGTCCAACCX: ACATa«3C»G ATCaTGGAGT GCATCQGRTC ClTTCTAmc 2280 

CAACGACAG0 AGCATCTTGA CTCTCTCCAC AATGGACTCR ICTACTTOTT AAASGGGCAfi 2340 

I^S^^ ^^^^ TTCACCTCCT TTCCTAAAAT TCAGTOTGAT CACOCWTTA 2400 

TAGCTCTGAA ATTAATT^CC AAAATCTTC TAGiaWSTTO TACCCRCTCA 2460 

. GAGTTATAAT GGCAAACRAA CRGAAAGCAT TAOTACAAflC CCCICCCRAC ACOCtS^ATT 2520 

i^SSJSSS^ CATOTTAAAA TTTG&fiAATA AAGAGAGATT TTTCAICTCT rEGTCTCGTT 2530 

TG^CCTTOT GCTTKMOGA CTCCTAAIGG CATTTCAfiTC TCSTTGCTGAG GCCAii^TAT 2640 

TTiaMATRA ATGTAGAAAA AAGAGflGAAA TCTTAGTAAA OAGTAITTTT TAGTATTAGC 2760 

TTWaiTAJTG ACTCTTCTAT TTAAATCTGC TTCTGTAAAT TA113CTBAAA QTTTGCCTTS 2760 

^0 ^S^SS^Sr Tmi!?"^ OAGITATATT TAAAOCTTTT CATGOSAAAA GTTAATGTGA 2820 

MACTQAGGA AiTTTTGGrCC CTCAGT^C TGTOTTGmA ATTCATTRAT QCATtSgS 2880 

TTO^ftGAGC AAATTAQGRG AATCATT^ AACCATTATT TACTGCAGTA T^^?^ HTq 

ATTTATACCA ATTCCTCTAA CTGTACTOTA AC51CAG0CTO TAAAGTTAGC CATATAAATO 3 0OO 

CAAG^ATA TCATATATAC AAATCAGGAA TCftGGTCXOT TCPJ^C^ T^TT^ Hsl 

^1^*^'^ ATTTTTGTGA CAGaBlATAA AfiAOCCTATA dIGGGTAAAT mSaSaT lllo 

' r^™'^'' TTAATTTAAT GTCTTTATCA TTGGAXCTTT TOCATGCTTT AAtSgS^I lllo 

S?3^!?r^t S^r'^ ATATCRATAA AAAGTrTQQA CAQTATTTAA ATKTTQCRAA 3300 

J^r^^ TATACAAATC AGAATAfiTAT GGGXAATTAA ATCSAATftCAA AAASR^flC 3260 

J«3CX:GACrTA GACATGCTCT ttCJCCXTTCTA TAaGCTAOAT TTTO^A^ 3420 

^ISSS^ iSSS^ GTGTGOCAGT ACAAGGCATA TTTCAGQTGT GGCIGTCGAA 3540 

ATCAGGTAAT GTTAGCAATA AATTAAATGC TAAGAATGAT 3600 

I^JSSS^ CATGITACTG TAATTAACTC ATW3CACTTC AAAACCtAAC TXCCMCCTG 3660 

AATTXATCAA GTAGTTCftQT ATTGTCATTT 6TTTTTGITT TATTGAAAAG TftMOTTHTr InZn 

65 TTAAOATTTA GAAGIGATTA -ETAGCTTCAG AACERTTACC ^CTCIAAO S^SS^ llH 

TTCTATACAT ATTAAGATAA TGGTTAAATC COQTXTTACC A^Stto^ S^^?^ lHo 

^2^"^''^ TTGTGCAGCC CTAAGCTTCC TlTCxrCATTTC ATCAATATAA 3900 

GGCTTCTAGIR ATTGGACTGG CAGGQGAAAO AATGGTAGAG ACRGAAATTA AGACTTTATC 3960 

70 S^I^S; TATTTTCTTG ClAAOSEAAC ATTIGTctS SSStc 402^ 

AAGTTATTAA GCTAAATATT AATTTTCRAA AATAarOCTT cSWSi 4080 

SISSS^ fi^^r" ^^'^^^^ GTIATTCI^ AAGTACTAAA aS^Sa lllo 

OjAOraaCftA TOTCTGCATT C3VCTAATTCA TGTTCCAGAA GAGGAAATAA TGAAGATATA 4200 

S^SJS?*^ TACTW3GTGG GftGQATATfiQ AAATTTGCnX! ATAAAATCTC TTATAaSS «60 

7^ ^SS?^^^^^ AAAATGACAC CCMSTAGGCC TGCATTACAT TTAdOGAOC OTGrrraiTT «3n 

/5 GCCATCAAAT AAAClGAQTA CTCACACCM ACAAAOftCrC CRAABlS^ ^S^JJ tlln 

TGAJ^TG CAGCftAGACA GGAGGTCAGC TCGCCTATAA SgSSaA ^S^TO till 

ATQXAMTTT CTGTACTCRC CATTTGAAQT TAGTTAAGGA GAACITTATT TTlTT^AAA 4500 

AAGTAAATQS CAACCaCTAG TQTGCTCATC CTQAACTaTT ACTCCRAATC CACTCOGTTT till 

80 nl^iS^ ATTATCTTGT GATTTTAAGA AAAGW3TTTT CTaSSSsS ^^SS^ «IS 

CAATGCAGTC TQCAAGCTTT CftGTAGTTTT CrASTOCTAT ATTCATCCTG XAM^CTT ImS 

ACTAOQTAAC CAGTAATCAC AAGGAAAGTG TC0CCTTO3C ATATricT^ SSJ^SJ lylt 

^^^^J^!^'''''^''^'^^^^^^ till 

QCTAAATAC3G TTATTGCTAA TCaGTGGTCT CSiAATOQATT TGCCTCCCXT TX3CClX33TCr 4860 

GAGOeCTGTA AGCKTOAAGA TAGTOGcaAG CACCAAGTCA GTTTOCAAAA TTGC3CCCTCA 4920 
QCTGCTTTAA GTGACTCAGC AOCCB3CCTC A6CTTCAGCA GGGGTAGGCT CTUXTCTOGGC 4980

GGAGCAAAGT ATQQGCCftGG GAGAACTACA. GCTACGAAGA OCTGCTGTCG AGTTGAGAAA 5D40 

AGOGGAGAAT TTATG(?TCTa AATTTTCTAA CTGTOCTCTT TCTTGGGTCT AMGCTCATA 5100 

_ ATACACARAG GCTTCCAGAC CTGAGCCACR CCChGGCCXrr ATCCTGAACA GGAfiACTAAA 5160 

3 CAQAGGCAAA TCAACCCTAG QAAATACTTQ CATTCTQCEC TACOffTTAGT ACCAOGACTG 5220 

AGGTCATTTC TACTGGAAAA GATTGTGAGA TTGAACTTAT CTGATCGCTT GAfiACTOCTA 5280 

ATAGGCAGGA GTCAflGOCCA CTAGAAAATT GACAGTTAAC AGCCAAAAGT TTTTAAAATA 5340 

TGCTACTCTG AAAAATClCG TGA7«3GCTGr AGQAAAAtJGG AGAATCXTCC ATQTTGGTGT 5400 

TTTTOCTGTA AASATCftGTT TGGGGTATGA TATAAGCAGG TATTAATAAA AATAACACAC 5460 

lU CAAA6AGTTA GGTAAAACAT GTTTTATTAA TTTT0OTCCC CACXSITACaGA CATTTTATTT 5520 

CTATTTTGAA ATQAQTTATC TATTTICATA AAAGTAAAAC ACTATTAAAQ TGCT6TTTXA S580 

TCTGAAATAA CTTGAATCTT GTTCCrATAA AAAATAGATC ATAACTCATS ATATGTTTGT 5640 

A ATCA TGGTA ATTTAQATTT TTATGAGGAA TGAGTATCTQ GAAATATTGT AGCAATACTT S700 

- ^ GGTTTAAAAT TTTGGAOCTG ASSACACTCSTQ QCTGTCTAAT GTAATCCXTT AAAAATTCTC 5760 

ID TQCATTGTCA GTAAATGTAG TATATTATTG TACAGCTACT CATAATTTTT TAAAQTTTAT 5820 

GAAGTTATAT TTATCAAATA AAAACTTTCC TATAT 5855

Seq ID NO: C114 DNA Sequence
Nucleic Acid Accession #: XM_087461.1
Coding sequence: 236..1138

1 11 21 31 41 51 

| | | | | |

CXXSGCOCCGO GGG0G0CX30G fiAACCCCAAA CnCAACtXSGG TCTOGAGGGA TCCXCXMGCC 60 

OMSCCPiGCCG CCGTCACCX3C CTCOSOSOCG CCCCTGCGGG CITOGCAGGC GCCCGGCGCQ 120 

COCJGCACTQC GCC0G0C06C GGGCICCCX3C GGTCCCACOQ TGftGC3X36CC GSCCGaTOGC IBO 

caacTCGccA tgcaaccsgcc acaoaccrca cgogcgtass ggcccigccqc aggocatoct 24o 

GOCXXnXSCTC GCCXacnCTCC TGGCOGCOSC CTGOOCGCTG CCCCCCGTCC GCXSGCOQGGC 300 

CGCGQACGOG CCXX3QCCTCX: TCGGGQTGCC CTCCAATGCT TCAGICAACQ CGTCCTCCGC 360 

OGCXSAGCCCA TCGCXXZCQCO GCTGCTGGOC TCGGOGGCCC CXOTC30CC0C GGAfiCJOCCaS 420 

GGCCCXaSABQ AGGCGGOS6C (SOOCSGCOlQCa CCT^nOCAAC ATCAGGGTGC ASOGGCAGAT 4&0 

GCTGAGCTOG CTGCTCCaTGC OCTGGGGCGG COCGGQGGGC TTCCRSJTQCG ACCTACTOCT 540 

CTTCTCCACX: AAOSOGCAiCG GCOGCOCTTT CTTGGCaSCC GCCXTOGSkCC GCGTCXSGGCC £00 

_ « QCOGCTGCrC ATC5GAQCACC TGGGGCTGGC GGOSGGCGGC GCOCRGCAGG ACCTGCQCCT 660 

J 0 CTGCQTGGGC TGC3GGCTGGG TGCeOSGTCQ CCQCACGGGC OGOCICCGGC CX3BCCGCCX3C 720 

□CCCAGOGCC GCCX5CCQCCA C06G0GGGSC GCCCACOGCQ CTBGCAGCCT AGCCCacGGC 780 

CXSAGCX3GC0C GGGCCGCTGT GGCrGC3U3C38 CGAGCOGCTG CATTTCTGCT GGCrASACTT 840 

CAGCCTGGAG GAGCTGCAGQ GOGRGOCGGG CTGGGOGCTQ AACCQTAAGC CCAlTGAaTC 900 

CAOaCTGaTO GCCTGCTTCA TGACCCTGGT CaTOGTtSGTG TGGAGOGTQa CX^GCOCTCAT 960 

W CTGGOCGGTG CCCATCATCXJ OCGGCTTOCT GCCCSiAOSeC ATGSAACRGC GOCQGAOCAC 1020 

OGOCAGCACC ACOGCAGCCA CCCCCJGOCSaC AeTTOCCCGCR GGGACCM3CG C3MK30SCX3GC 1080 

OGCCGOCGCC GCTGCCGCCX3 OCGCCX306GC OGTCACTTC3G GGG3TGGCGA OCAAQTGACC 1140 

CQCTCXS3CTC CTCCCTGTGT OCGTCCTOTQ TC0GCGOGC6 OQQGTGCCTT TCOOGCOGGA 1200 

QACTOGGOCG GTQTGCTTCG TGCTGTAGIT AT06TTAGTT CXHCTTCCCG AGAarGGGGOC 1260 

T"^ GCOOauaAGAC COCAQCeCCT TTOAAAAGCA AGGITTGT6C TGCQCTTCSCA GTTCGGAAAA 1320 

GCAGATGTTT AAOCCCTTGG ACTGAGGGTG GQKECOCUSC TOOGAAGAOG G&GACTAGGG 1380 

AAATQQCaGCC CTTTOCCCXC TATTGCaVTOC OOCT6COCS3A CTCCTTCOOC GCACOCAOGT 1440 

GCCCTAGATT CATBGC3«3AA AATGACCAAA TCCTGTGTAT TTGTTTTATA fTATTTAATAA 1500 

CTGXTTTAAA TGAAAGrTrr AfiTAAAAAAA ATACAAAACA AAAAGATTAA ATTGCCATTG iSfiO 

DU CTQTAGTAAG AGAAQCTCrX TGTATCTGAA GATAfiTTGTA TITGAAATTT GTGGTTTTTT 1620 

AATTTATTTA AAATTGGGGG GAGGQCaTQQ GAAGGATTTA ACACX3SATAT ATTGTIACOS 16BD 

CTGAAAATGA ACTTTATQAA CCTTTTCCAA GTTGATCIAT CCSAOrOAOGT GGCCtEGGTOa 1740 

GOGTTTCTTC TTOTACTTAT GTGGrTTITT GGCTTTTAAT ACAGAGAXTT TC3CTCC 1796 

Seq ID NO: C115 DNA Sequence

Nucleic Acid Accession #: XM_051532.4
Coding sequence: 127..1215

AcaiaTTOTTG CAAAGTGCTC AGCACTAAGG GAGCCAGOGC ACSiSCACAGC C3^GGAAGGG6 60 

AGOGAGCCCA GCXMSCCCAG CCAGCKCAGC CAGCOCGGAfi GTCaTTTGAT HGCSCCGCCTC 120 

ACaaACSSATOG ATCTGCATCT CTT<5aACTAC TCAGAGOCAG GGAACTTCTC GGACATCRGC IBO 

TOGOCATGCIl ACAQCftSOaA CTGCATOGTG GTGGACAOSS rraaTQTGTCC CAACATGCCC 240 

AACAAAAGOB TCCTGCTCTA CAGGCTCTCC TTCATTTACA TTTTCATCTT COTCSVTOSGC 300 

ATQATTGCCA ACTCJCGTGGT OaTCTGGGTG AATATCCAGG OCRAGACCAC AiGGCIATGAC 360 

ACGCACTGCT ACATCTTGAA OCTGGCCAIT GGOQACCTGT OGGTTGTCCr CACCKTOCCA 420 

GTCTQQGTGG TCRGTCTCGT QCRaCACRAC CSUSTGGCXXA TGGGOGAGCT CACGTGCAAA 480 

6XCACACAC3C TCATCTTCTC CATC3UICCXC TTCGGCAfiCA TTTTCTTCCT CACJSTOCATO 540 

AGCGTGGACC GCTACCTCTC C&TCAOCTAC TTCACCAACA COCOCAGCaG CAGGAAGAAG 600 

A1QSTACQCC OTGT0GTCT6 CATCCTGGTQ TflaCTQCTGO CCTICTGOGT G^TCTClGCJCr 660 

GACACCTACT AOCTQAAOAC CXSTCRCXSrCT GOGTCGAACA ATGAGACCCA CTGCCGGTOC 720 

TCCTACCCCG AGCaCAGCAT CAAGGAGTGG CIGATCOaCA TGGAGCTGGT CTCCGTTGTC 7B0 

TTGCSGCTTTG CCQTTCCXTT CTCCATTATC GCTGTCTTCT ACTTCCTGCT QGCXawaAGCX: B40 

A^TCGGGQT CCAGTGACXIA GGAGAAGCAC AGCRGC30GGA AQATCATCTT CTCX:TACGTC 900 

GTGGTCTTCC TTGTCTOCrQ QCTGC5CCTAC CAOGTGGOGG TGCTGCTGGA CaVTCTTCTCC 960 

ATCCTGCACr ACATCCCTTT CACC3PGCC5QG CTQSAI3C5AC6 CCCTCTTCAC GGCCX^TGCAaf 1020 

G1XACACAST GCCTGTOGCT GGTSCACTGC TGCGTCAACC CTGTCCTCTA CRGCTFCKTC 1080 

o-^ AATOGCAACT ACAGGTACSQA GCTGATGAAG GCCTTCATCT TCAAGTACTC GGCCaVAAACA 1140 

OU QGGCTCACCA AGCTCATOGA T6CCFCCAGA GTCTCAQAGA OQQAGTACTC TGCCTTGGAG 1200 

CAGAGCACCJV AATGATCTGC CCTTGGAGAGG CTCTCGGAOG GGTTTACTTG TXTTTGARCA 1260 

GGGTGATGGG CCCTATGGTT TTCTAGAGCA AAGCAAAGTA GCXTOGGGTC TTGATGCTTG 1320 

AfiTAGAGTOA AGAGGGGAGC AOQTC3CX3CCC TGCATCCBtTT CTCTCmCT CTTGATGACXS 1380 

CAGCrOTCAT TTGGCTGTGC OTQCXGACAG TTTTGCAACA GGC3U3ZUGCT6 TOTCXSCACftO 1440 
CAGTGCTGl:G CBTCAGAGCC AGCrOAGGAC AGGCTTGCCT GGACTTCTGT AAGftTAOGAr 1500

TTTCraTOTT TCCTGAATTT TTTATATQOT GA7ET6TATT TAAATITTAA GACTTTATTT 1560

TCTCACTATT GQTGTACCTT ATAAATGTAT TTGAAAaTTA AATATATTTT ARATATTQTT 1620

TGGGAGQCAT AGTGCTGACA TATATTCAQA GTGTTGTAGT TTTAAGGTTA GCGTGACTTC 1680

AOTTTTGACr AAGGATCACA CTftATTGTTA GCTGTTTTC3A AATTATATAT ATATAAATAT 1740

ATAXAAATAT ATAAATATAT GCCAGTCTTG GCTGAAATST TSTTATTTAOC AIAGXmAT 1800

ATCTGTGTGG TQTTTTGTAC GGGCAOGGGA TATGGAACGA AAACTGCTTT GTAATGCAGT 1860

TTQTGACATT AATAGTATTG TAAAGTTACA TTTTAAAATA AACAAAAftAC TGTTC3?aaAC 1920

TGCAAATCTG CACACACAAC OAACAjSTTCC ATTTCAGAGA GTTCTCTOUk TTTGTAAGTT 1980

ATTTTTTTTT AATAAAGATT TTTGTTTCCT 2010



Seq ID NO: C116 DNA Sequence

Nucleic Acid Accession #: NM_000350.1

Coding sequence: 83..6903

1 11 21 31 41 51 

I I I I I I 

CTGtSCTCTTA ACGGCGTTTA TGTOCTTTGC TGTCTGAGGG GCCTCAGCTC TGACCAATCT 
GQTCTTCGTG TGQTCATTAQ CATQOaCTTC GTQflGACAGA TACAGCTTTT GCTCTQQRAG 
AACXQGACCC IGOGGUU^ GCAAAAGATT CGCTTT6TGG IG6AACTCGT GTGGCCETIA 
TCTTTATTTC TGQTCTTGAT CIGOTTTAAGQ AAXGCC3UICC CGCTCTACAG CC3VTCATBAA 
TGGCATTTCC CXMGA3U3GC GATGOOCTCA GCAGGAATGC TGCOGTGGCT CCASOOGATC 
TTCTGCAATQ TtHAACflATCC CT6TTTTCAA AQCCCCACCC CAGGAGAATC TOCTGGAATT 
GTGTCAAACT ATAACTiACTC CATCTTGGCA AGGGTATATC GAGATTTTCA AGAACTCCTC 
ATGAATGOVC CAGAGAGCCA GCACCTTGGC CGTATTTGGA CAGAGCTACA CATCTTGICC 
CAATTCATGO ACAOGCTCOG GACTCACXXG GAQAQAATTO CAGGAAQAGO AATAGOAATA 
AGGGATATCr IGAAAGATGA AGAAAC3VCTG ACACTATTTC TCATTAAAAA CATCGGOCTG 
TCTGACrOMB TGQTCTACCT TCTGATCAAC TCTCAAGTCC OTCCAGAGCA QTICQCSCAT 
GGAGTCCOQG ACCTGGCGCT GAAQGACATC GOCTGCAGCX3 AGGCCX:TCX:T GGAGCGCTTC 
ATCATCTTCA GCCAGAGA03 CQGG6CAAAG ACGGTGCGCT ATGCCCTGTG CiCCCTCTCC 
CAGGGCACCX: TACAGTGGAT AOAAGACACT CTGTATGCXT^ ACGTGGACTT CTTCAAGCTC 
TTCCJS-rGTGC TTCCCACACT CCTAGACAGC CSXTCTCAAG GTATCAATCT GAGAmolGG 
GGAGGAATAT TATCTGATAT GTCACCAAGA ATTCAAGAGT 1TATCCAT0G GCOGAGTAT6 
CJ^GGACTTGC TOrGGGTOAC CAOOCCCCTC ATGCftGAATG GTG6TCCAGA GACCTTTACR 
AASCTGATGG GCATCCTGTC TGACCTCCTG TGTGGCTACC COGAGGGAOG TGGCTCTOGG 
6T6CZCTCCT TCaiACTGGTA TQAASACAAT AACZATAAGG CCZTTCCGGG GATTGACTCX: 
ACAAG6AAGS ATCCTATCTA TTCTTATGAC AGAAGAACAA CATCCTTTTG TAATGCATT6 
ATCCRGAGt3C TQGAGTCAAA TOOTTTAACC AAAATOGCM GGAGGGCGGC AAAGCCtTTO 

ctgatgggfta aaatoctgta cactcctgat tcacctgcag caobaagqat acigaagaat 
gccaactcaa cttttqaaga actooaacac gitaogaagt oxsgtcaaagc cioggaagaa 

GTAGGGCOCC AGATCTGGTA CTTCTTTGAC AACAGCACAC AGATGAACAT GATCAGAlCSKT 
AGCCTGOaaA ACCCAACA9I AAAAGACZT7 TTGAATAGGC JVGCTTGGGm AGAAf3GZATT 
ACTGCTGRAG OCATCCTAAA CTTCCTCTAC AAGGGCCCTC GQGAAAGCCA GGCTGADGAC 
AIGGCCAACT TOaACTCSQaQ GGAXZATATTT AAf3lTCaClG ATOGCACOCT CCOCCTQ&TC 
AATCAATAOC IXSGAGTGCT^T GGTOCTGGAT AAGTTTGAAA GCTACAATGA T6AAACTCAG 
CICACCCAAC GTGCCCrCTG TCXACTOGAO OAAAACATGT TCrGGaOQQG AGTGOlSaTC 
OCT6ACATGT ATCCCTGGAC CAGCICTCrA CCAOCCCAOG TGAAGTATAA GATCOGAATG 
GACATAGACSG TGGTGGAGAA AACCAATAAG ATTAAAGACA GGTATTQ6GA TTCTOGTCCC 
AGAGCTGATC COSTOGAAGA TTTCOGGTAC ATCTGGGGOG GGTTTGCCTA TCTGCASGAC 
A-rGGTTGAAC AGGGGATCAC AAGGAGCCAG GTGCAGGGGG AGGCXCCAGT TGGAATCTAC 

crocMcnsA tgggciaccc ctgcttostg gacgattctt tcat^tcat cctgaacogc 

TGTTTCCXrPA ^TCTTCATQGT GCTGGGhXGG ATCXACTCTG TCTCCATGAC TGTGAAGAGC 
ATCGTCTTGG AGAAGGAGTT GOSACTGAAG GAGACCTTGA AAAATC31GGG TGTCTCCAAT 
GCAGTGATTT GGTGTACCTG GTTCCTGGAC AGCTTCTCCA TCAT6TCGAT GAGCATCTTC 
CSrCCTGAOOA TAITCATCAT GCT.TGGAAGA ATCCTAIIATT ACAQCQhCCC ATTCATCCTC 
TTCCTGTTCT TGTTGGCITT CIGCACIGCG AGCATCATGC TQTGCTTTCT GCTCAGCAOC 
TTCTTCrCC3L AGBGCZAGTCr GGCA^CAGCC TGTAGTGGTG TCKTCTKTTT CiACGCTCTAC 
CTGCCACACA TCCTGTGCTT OSOCTGGCAG GACG6CATGA OOGCTGAGCt GAAGAAGGCT 
GTQAQCTTAC TGrCTrXOOT QaCATTTQOR TTTOOCAirrrG AGTACCTOGT TCSQCTTTGAA 
GAOCAAGGCX: TGGGGCTGCA GTGGAGCAAC ATOGGGAACA GTCaCAOaQA AGGGGACGAA 
T TCAG CITCC TGCXGXOCAar GCAGA-XGATG CeCCnOKrG CTSO S TGCTA TGGCmACCG 
GCTTQGTACC TTGATCAGGT* GTTT0CAG8A GACEATGQAA CCCCACTTCC TTGOIACTTT 
CITCTACAAG ACTCGTATTG GCTTAGCOGT 6AAGGGTGTT CAAOCAGAGA AGAAAGAGCC 
CTGGAAAASA CCGAGCCCCT AACAQAOaAA ACX3GAGGATC CAGAGCACXXT AGAAGGAATA 
CAOSACTOCT TCTTTGAAOG TGAGCATOCA GG6TGGGTTC CTOGOGTATG CGl^GAAGAAT 
CraOTAAAQA TTTTTGAGCC CXGltSGGCseG OCAGCTGTGG ACOGTCIGAA CATCAOCXTC 
TAOmSSJUX: AOATCftCCGC ATTCCIGGGC CACAATGQACS CTGGQAAAAC CACCACCTTG 
TCCATCCIGA CGGGTCTGTT GCiCACCAACC TCTGGGACTG TGCTOGTTOG GGGAAGG6AC 
ATTGAAACCA GCCTGGATGC AQTCXGGCJAa AGCCTTGGCA TGT6TCCACA GCACAACATC 
CTGTTCTCACC: ACC7CAOOGT GGCTQAGCAC AIGCTGTTCT ATGOCCAGCT GAAAGGAAAG 
TG0CAGGAG6 AGGCCCAGCT GGAGATGGAA GCCATSTTGQ 3U3GACACAG6 OCTCCAOOU: 
AAGCBGAATG AAGAGGCTCA GGACClATCA GGT6GCATGC AGAGAAAGCT GTOGGTTGCC 
ATTC3CCTTT6 TGGGAGATGC CRAGaTGOTCG ATTCTGGAOG AACCCACClC TGGGaTOtaAC 
CCTTACTC3aA GAOGCTCZAAT CTGGGATCTG CTCCTGAAGT ATCGCTCAG3 CAGAACCATC 
ATCAa:GOCCA CTCACCAiCAT GGACaAGOGC GACCAOCAAG GGOftCCGCAT Tl^^CATCATT 
GCGCAGGQAA GQCrCiaCTG CTCAGGCACC OCACTecTGC TGAAGAACTQ CTTTGGCACA 
GGCTTGTACT XAACCTTGGT GGSCAASATG AAAAACATCC AGAGCCAAAG GAAAGGCAGT 
GAGGGGACCT GCMCTQCTC GTCTAAGGGT TTCTCCACCA OGTOTCCAGC CCACGXOaAT 
GACCTAACTC CAGAACAAGT CCIOGATGGS GATGTAAATG AGCTGATGGA TOTAGTTCTC 
CACCATGTTC CAGAGGCAAA GCTGGTOGAG TGCATTGOTC AAGAACTTAT CTTCCTTCTX 
CCAAATAAGA ACTTCaWQCA CAGAGCATAT GCCAGCCTTT TCAGAGAGCT GGAGGASACB 
CTGGCTGACC TPGGTCTCAG CAQTTTTGGA ATTTCTQACA CTCXXXnXJOA AGAGATTTTT 
CTGAAGGTCA CGGAGGATTC TGATTCAGGA CCICTGTTTa OSGGTGGOGC TCHGCRISRAA 
AGAGAAAACS TCAACCSCCOG ACACC3CCrGC TTGOGTGCCA GAGAGAAOGC TOOACAGACA 
CCCCAGGACT C3CAATGTCTG
CCXXSCAGAjQC CAGAGTOCCC 
CATGTGCAG6 CGCIGCTGGT 
CTGGCaCAGR TCGTGCTCCC 
ATCCTTCCTT TTQGCGAATA 
TACACCTTCT TC3«3CATGGfl. 
CTCCTGAATA. AOnCAfiOCTT 
CCCTGTGGCA ACTCRACACC 
TTCCROAAGC AGAAATGGAC 
GAlSAAGCTCA CCATGCTGCC 
AOAACACAGC GCRGCACGGA 
TTGGTAAAAA CGTATCCITQC 
OAACAQAGCT ATGGfifiOAAT 
GAAOCACTTG TTGOCSTTTTT 
ATCACTAGAG AGGCCTOTAA 
AACATTAAQG TGTGGTTTAA 
GCCCACAAC?G OCATCXTACG 
ATCAOCGTCA TTAOCCAACC 
CTGAOC3VCTT CACJTGGATGC 

cCRGCcavscr ttotccttta 

TTTATCA6T6 GAGTGnOCCC 
AATTATTCCQ TGAQTGCTGC 
TACACTTCTC CAGAAAACCT 
GTCATTOCCA TGATGTACCC 
QCTTTATCTT GIOCTAATCT 
GAArrATTTTG ATAATAACOG 
MTGTCiTOC CCCACTTCTQ 
OTGACAGATG TCTATGCC30G 
CTGATTGGaA AGAACXTTGTT 
CTGCTGGTCC AQCGCCACTT 
AITQTTGATO AftGATOATGA 
AAAACTOACA tCCTXAAGGCT 
GCA6TG(3ACA OGCTOTGTGT 
AATGGT0CCG OCAAAACAAC 
GGGGATOCCa CCGTAGCJWWS 
ATOGQCXACT GTCCTCRiaTT 
TACCTTTATG CCaSQCTTCG 
2U3TATTAAGA GCCTGQGCCT 
GGCAACAflCC GGAAACTCTC 
CTCGATGAGC CCACCACAGG 
GTGA3CATCA TC3U3AAAAGG 

QftGSc acroT gxagccggct 

ATTCA6CATC TCAAflTCCAA 
CX^QAAGGACG ACCTGCTPCC 
CCAG6CAGTG TGCAGAGGQA 
TCCCTGGaSA GOATCTTCCA 
TACrCAGTCA CACAOAOCAC 
OftAaSTCATG ACCTOCCTCT 
T6ATCTTTCR CAjOOOCTCGT 
GQAGCSCTGTG CCCATASOGT 
GCAGAAAACA AACACAOGAG 
AOCGAARCre ACTTGCTCAC 
TTCSCCM&C COGAGAACTA 

TnacTcsrcAz ttc!aagcaga 

^CGXTTTTCh TOGAAAAAXA 



CTCCCCAGGG 

Aflacxx:GCAG 

CAAGflGATTC 
GGCTADCTTT 
CC3C30GCXTTG 
TGAACCAGGC 
TGGCAACCGC 
CTOQAAGACT 
ACAGGTCAAC 
AGAGTGCOCC 
AATTCTACAA 
TCTTATAAGA 
TTCCATTGOA 
AAGCQACCTT 
AGAAATAiCXn' 
TAACAAAGGC 
GGCCAGCCTG 
CCTGAAC!CTG 
TGTGOTTGCC 
TTTGATCCAG 
CacCAGCTAC 
GCTQerOGTG 
TCCTGCCCTT 
AGCATCCTTC 
GTTCATCGGC 
QACGCTGCrC 
CCTGGaCCGG 
GTTTGGTGaa 
OXSCCATOGTO 
CTTCCTCTCC 
TOTGGCTQAA 
ACATGAACTA 
CGGAGTTCGC 
CACATTCAAG 
CAAGAGTATT 
TGATGCAATC 
AGGTGIAOCA 
GACTGTCTAC 
CAGAGCOKTC 
GAVOGACGCC 
OAGGGClXTTG 
GGCCA TCATG 
A-J1-4-IU6AQAT 
TGACCTGAAC 
GAGGCACTAC 
GCTCCTCCTC 
ACTOQACCAG 
GCflJ50CTCQA 
TCCTGCAQCC 
CATCCAAA1X3 
tSAGCATQCAG 
CTGGAACACC 
(SAAAOCCCDGO 

AAATGCAAAT 



GCQCGGGCTG 
CTCAACACGQ 
CAACACACCA 
GTGTTTTIGG 
ACCCTTCACC 
AGTOAGCAGr 
TGCCTQAAGG 
CcrTCTOTGT 
CCTTCACCAT 
eAGGGTGCOG 
GAOCTGACGQ 
AGCAGCTTAA 
GGAAAGCTCC 
OOCCOQATCA 

GArrrocTTA 

TOGCATGCCC 
CCTAAGGACA 
ACX»AGGAQC 
ATCTGDGTGA 
GASGGGGTGA 
TGGGTGACCA 
G3CATCTTCA 
GTGGCACTGC 
CiaTTTGAIQ 
ATCAAjCAGCA 
AGGTTCAACG 
GGCCTCATTG 
GAGCACTCTG 
CTQC^AGGGG 
CAATOGATTG 
GAAAGACAAA 
ACCAAQATTT 
CCTGGAGAGT 
ATGCICACTG 
TTAACCAATA 
GATGAGCTGC 
GCAOAAGAAA 
GOOGACTGGG 
GCACTCATTQ 
CAGGCAC6GC 
6TC5CTCACAT 
GTAAAGGGO& 
G ^TAT ATQg 
OCTGTQGAGC 
AACATGCTCX: 
TCCCACSIAGO 
GTGTTIGTAA 
GCIGCTGGAd 
A0AAACSAAC 
GACTGGCCCA 
CGAATTCaSA 
TGATGGTGAA 
GCCATCGCAC 

TOCAxerrxG 

GCACTCATCA 



CTCACOCAGA GGGCCAGCSCT 4060 
GGACACAGCT GGTCCTCCAQ 4140 
TCCJGCAGCXZA CAAQGACTTC 4200 
CTCTQATGCT TTCTATTGTT 4260 
OCTGGATATA TGGGCAGCAG 433C 
TCACGGrACT TGCAGAOffTC 4380 
AAGGGTOGCT TOCGGAGTAC 4440 
GCCCAAACAT CACCCAGCIG 4500 
CCTGCAGSK3 CAGCACCAGG 4560 
GGGGCSCTCCC GCCCCCCCAG 4620 
ACAGQAACAT CTCOGACTTC 46B0 
AGAGCAAATT CTQGGTCAAT 4740 
CAGTCOTCOC CATCADGGGG 4900 
fGAATGTGAQ OGGGQGCXET 486 Q 
AACATC3!AGA AACTGAAGAC 4920 
TGOTCAGCTT TCTCAATGTQ 
GGAGCCCOGA QQAGTATGGA 5040 

AGCTCTCAGA GATTACAGTQ 51 00 

TTTTCTCCAT QTOCTTOGTC 5160 

ACAAATOCAA GGRCCTCCAQ 5220 

ACTTCCTCT6 GGACATCATG 52 BO 

TCQGGTTTCA GAAGAAAGCC 5340 

TOCTGCTGTA TGGATGQGOG 5400 

TOOCCASCAC AGCCTAT6TO 5460 

OTGCXATTAC CTTCATCTTG 5520 

COSTOCTGAQ GAAGOTGCTC S5B0 

ACCTTGCRCT GAGOCftflQCT 5640 

CAAAlCOGTT CCACTGGGAC 5700 

TGGTGTACTT OCTCCTGACC 5760 

COSAGCCCAC TAAGGAQCCC 5620 

GAATTATTAC TGGTGGAAAT 5880 

ATCTG GGCAC CTCCAGCCCA 5940 

GCTTTGGCXrr CCTGGGAGTQ 60QO 

O^MOkCCAC AGTCACCTCA 6060 

TTTCTGAAQT CCATCAAAAT 6120 

TCACaiSQAOG AOAACATCTT 6180 

TCQAAAAGQT TGCRAACTGG 6240 

TGGCTQGCAC 6TACAfilX3GG 6300 

GCTOOCOiCC GCaraGTGCTQ 6360 

<3CATGCT(3IG GAAOGTCATC 6420 

CCCACAGCAT GGAAGAATGT 6480 

CCTTTC3GATG TATGGGCACC 6540 

TtaCAATGAA GATCAAATOC 6600 

AOTTCITCXA flOGGAACTTC 6660 

A^WJCMOOT CTOCTCCKIC 6720 

ACAGCCTGCT CATOQAGGAG 6780 

ATTTTGCTAA ACAGCAGACT 6840 

OCftGTCXSACA AGOCCAGGftC 6900 

VCTOGGCAGC TGGftOQOGCA 6960 

QQCSTAA ATGA CCOCACTGCA 7020 

AAGAGGTCTT TCAQAAGGAA 70 BO 

^"^JCAAAQIAA l-ACAAAATCC 7140 

TAGCAOCTTT QGOCTCCATA 7200 

MJOTGTerC TOOErTGMT 7260 

CAAAAAAAAA AAAAAAAA 7318 



Seq ID NO: C117 DNA Sequence

Nucleic Acid Accession #: NM_006671.2

Coding sequence: 138..1820



1 
I 

GGCAOGAGQC 
BCCXZAfSACCC 
AGCAGGCCAQ 
fiGCaOGAATGG 
TCTTCTTGAO 
AQCTCCTGAT 

coGSAcrrGc 

ACTACCTGTG 
ACCXAdGGCAG 
CAGOCGATSC 
CATTCAAACA 
AGGAOGOCXX: 
TGCaotAACiT 
QCACCAGCOA 
TGCTGGGCOG 
AGTGBGTCAT 
TCATTG0C3GQ 
TCTACTCAGT 

tctacttctt 
toctcatoqc 
tgctggaoaa 
ccatcaacat 



11 
I 

TGGTGTTTAO 
TGTGCCCCCG 
GCTCRCCATG 
ACIOCTCATC 



CTOCCTOGAI 
QAOGACCTTC 
OGCGGCCCAS 
CCTGTTGBAC 
GTACGQCAOC 
TCXITOGGOGQ 
OGCCCIGGAC 
TGOCATGAAX 
CA7GGG!DGftC 
OAAGaVTCGTG 
TAAOATCCTG 
CACCXSTGOTG 
CATCAGCAAQ 
GCTGGCCACC 
CAAOCACATC 
OGACXaGCACT 



21 
I 

CAACrCCTAC 
GCCGGGCTCT 
OTGCOGCATG 
CTGTCTGTGC 
CTCTCACCAC 

GOCAAGAOCT 
ATGGCTGTC% 
AAGGAGACCA 
CTCSKPC30QGA 
AAOAOCACCC 
ATCCTCATCT 
CTQhlCCCCGC 
GTGCTGGGCA 
AGOSGGGGCC 
GOGGTGGCTG 
GAGATGGACX; 
TGOGGGCTGQ 
AAGAATOCCA 
TOCTCCAGCr 
GAOCG6CGCA 
GGGCXCZAOG 



31 
I 

CACCTGCCTG 
CATG03TGGA 
CXSaCTTGOC 
TGTCTGTCAT 
AGOAAATTAG 
TOCTGOC3RCT 
CTAgCOGCGT 
TOGTGGGCAT 



ACATGTTOOC 
CAfiTTGTCRA 
AOGOGGTCCA 
OGCCOGAGQT 
TCGTCTTCTT 

ccercGTCAG 

TQTGGTATTT 
ACCCCAGGGC 
TGCTOCROGG 
TCGTCTTCTT 
CAOCXa^CACT 
TCGCTOGCIT 
AGGCTGTQGC 



41 
I 

CTQAGGGGCT 
ATGGTGCTQT 
AOSGGGC3M8G 
OGTGGGCMC 
TTACTTCCAG 
OGTOGTCTOC 
GSGGGSCSCTC 
CTTCATGGTC 
TGGGAAQOCC 
AGCCAACCTA 
GTOOCCCAAG 
GGA03AG&AT 
OGTTTACAAG 
CTOT3CCACC 
CTTCTGCCRa 
CCJOCTTOaGC 
06TCGGCAA6 
GCTCTTTASCC 
CCQOGGCATC 
GCOCATCACC 
OGTQCTGOCC 
OGOCATCTTC 



51 
I 

AGAGCCXrCCA 
GC3CCCTXGCC 
QAOGTGTQGA 
CTCCTOGQCT 
TTCCCTGGAG 
AGCTTGSATGT 
ACGGTGGOGr 
TCCATCATCC 
ATCATOAGCT 
GTAGAAGOCA 
dTGGC»CCAG 
GGCTCGCAXG 
IC3GAGCCGG 
ATGGGCATCA 
TGCCTCAATQ 
ATT6TQTT0C 
AAGCTGOGCI 
CTGCCCCIOC 
CIGCAGGCTC 
TTCAAGTGCC 
GXOGGIGCCa 
ATGGCCOXGG 

TCAACAACTA CGAGCTQGAC TTTGQOCAOA TCATCACCAT CAGTATCAC& GCCACTGCM3 13 BO 

CC2U?CKTTGG GGCMXTrGGC ATCCCOCS^ CCGOCCTCGT CACCATGGTC ATCHTGCTCA. 3.440 

CCTCGGTGOC5 ACTGOCCACC GATQACATCA CCCTCATCAT TGCOGTTGAC TtSGGCTCTGG 1500 

ACCGTTTCCG CACCATGATT AACGTaCTGG GTGATGCGCT QGCAGCBGGG ATCATGGCCX: 1560 

ATATAT6TC6 GAAGGATTXr GCCCGGGACA CAGGCACCX?A GAAACTGCTG CCC1GC6AGA 1620

CCAAIQCCAGT OAGCCTCCAG GAGATC3GTGG CAGCCCAGCA QAATGGCTGi: GTGAAGAGTC3 3.6S0 

TASCGSAGGC CTCGGAGCTC ACCCTGGGCX: CCAGCTGCCC CCACCACGTC CCCGTTCAAG 1740 

TGGASCGGGA TGAGGAGCTG OCXX3CTGCGA GTCIGAACCA CTGCACCATC CAGATCAGCG 1800 

AGCTCSGftQAC CAATGTCTQA GCCTGCGGAG CTGCW3Q6GC AQGCaSflCSGCC TCCAGGGGCR 1860 

GGGTCCTGAG GCAOGAACTC GACTCTCCAA CCCTCCTGAG CA6CCGGCAG GGGCCAGGAT 1920

CACACATTCr TdCRdCClT GAOAGGCTGG AATTAACCCC GCTTQAOGQA AAATGTATCT 19B0 

C3U3AGAA0GG AAAGGCTGCA TGGGGGASCC CCATCCAGGG AGTGATGOGC CCGaC&TlGC 2040 

CTGAGQCCCC GCTGTGACAQ TTTCCCCGGT GTGRGCCC33a TGAOGGCCMC AGGCAGGGGT 2100 

TATCCGGCXX: CACTTTCTGG ATOACRGACT XGAGGCTCTG AGAGCTGAAA ACACTTQTCC 2160 

AAGGTCTCAC GTTAAGGTCA AGACACTAAC TGAAATCTTT CAAQCCCOGC CTCTCCTCTT 2220

OGAGGACAGG GCAGCCTQCA aCTOTGTCSCA GGCCCAGGCC GCACGCCATA ACAGGTOGCC 22 BO 

TCAGC3CACAC AGI^rCTCOCC AAGGGGAGCA GCC(3U3GaCC AAGCCCCGCT GCX^TTOCOCA 2340 

GGCCACAGTG COTCCAGTCT CCT6TCCTGC CACGTGTCTT TTGCAAftGCT CCTTQQATGT 2400 

QQTWaACnflAT GTCTTTAGTA GAGCTGAAAG QCCCCCTTGA CACRTCCAGG CCAACCTCCC 2460 

ATGGAATAGG TAGGCAAGOC AGOACTCCGG GAAS6ASGT6 CAGCCAGGAT OCTCTGGTGQ 2520

AGCTOCGGAT GGGGCGCXGG TSTCAGAACT CCCCAAAOGC CTGTGCGTOC AAGIGGAGTC 25S0 

AGGTTTTCTA TTCCTTTCWS 1X3TTTGCAAA TTCAGTGTTA ACXAAATAAA GGTATTTTGT 2640 

TTTTCAAAAA AA2^AAAAAAA AAA 2663 

Seq ID NO: C118 DNA Sequence

Nucleic Acid Accession #: NM_005689
Coding sequence: 278..2806

1 11 21 31 41 51

| | | | | |

GGGCCTGCAG TTGOGAGAnS GGTOCCGGGC CCKGAGCCAG CGQGSCCGTQ CTGAGACGGC 60 

GTACGTGCCC TGOGTGAGTG CGTGGCOSCO GGGCGTGGGC TAGOGGaSTG OGCGGTGIU9G 120 

CCTGGTOCAC GTGOGTCCCT TCCCGGGACC CCCGCAGCTT GGCGCOC3«3C QOCTAOGrGA 180 

OCCSIAQGCAC CCGGATGTOC GOGCCCCTCT GOQRGTGACA AGTCCCGGCC TCCQGTCCCG 240 

CAGTGCCCGC AGCCTCQQCC GQCGTCCACG CATTGCCATG GTOACTOTDa GCAACTACTG 300

CJGAiaGCCGAA GG6CCOGTGG GTGOCOCCIG GAXGCAOGAT GGCCTGaGTC CCI6CTTCTT 360 

CTTCAOGCTC GTaCCCTCQA CSCGGATGGC TCTAG6GACT CTOGCCXTGG TIGCrGGCVGT 420 

TCCCTGCA6& CGOOGGGAGC GGCCOGCTGQ TGCTGATTCG CTGTCTTGGG GGSCCGGOCC 480 

TCGCRTCTCX CCCTACGrGC TGCftGCTGCT TCTGGCCACA CTTCAOGOaQ CGCTGCCOrr 540 

OGCCGGCCIG GCTGGOOQGG TGGGGACTQC COGOGGGGCC CCftCTGCCAA GCTATCTACT 600

7CT6G0CXOC OTGCTSQAGA GTCTGGCOGG CQCCIGTGGC CrOTGGCTGC TXGXOSIGGA. 660 

GGGGAGQCAG GCftOGGCAGC GTCTGGCAAT GGOCATCrGG ATCAAGTICA GGCACRGOCC 720 

TGGTCrCCTG CTCCTCTGGA CTGTGGCGTT TGCA6CTGAS AACTTGQCCC IGGTGTCTTG 760 

GAACAGCCCA CftGTGGTGGT GGGCAAGGGC AGACTTGGGC CRACAGGTTC AGTTTAGCCT 840 

GTGGGTGCTG COGTATGTGG TCICTiaGAGG GCXGTTTGTC CTOGGTCTCT GGGCCCCTG6 900

ACTTCGTGOC CAGTCGTATA CATT6CA0GT TCATGMGAa GACCAAGAT6 TGGAAaSGAG 960 

OC3I^GGTTGG6 TdVGCAGOCC AAQUXTCl'AC CIGGOG^kGAT TTTGGCAGGA AGCTGCBCCT 1020 

CCTGAGIOGC TACCIGTGGC CT06AGGQAG Td^UMT^rG CtUSCTGGTGG TGCTCKTCTO 1080 

_^ CCTGGCSGCTC ATGGGnTOa AAOSGGCACT CAftTGTGTTG GTGCCTATAT TCTATAGGAA 1140 

CATTGTGAAC TTGCXGACTG AGAAGGCACC TTGGAAClCir CIGGCCTGGA CT6TTA0CV? 1200

TTAOGTCTTC CZCAaGTTCC TOCAGGGGGG TGGCACTGGC AOTACAOQCC TOGIGASOA 1260 

CCTGGGCACC TTCCTGTGGA TCOGGGTGCA GCAGTTCACa TCTCGGCG06 TGGASdOCT 1320 

CATCTTCTCC CAGCTGCACG AGCTCTCACT GOSdGGCAC CIGGGGCGGC GCACAGGGGA 1380 

OGTGCIGOQG ATOGOGGATC GOGGGACATC GAGTGTCACA GGGCXGCTCA QCTACCTGGir 1440 

GSTCAAT6XC ATGCCCACGC T06C06ACAT CATCATTGOC ATGAICXACT TCAGCATQTT 1500

CTTCNWGGC TGOTTiaaCC TCftXXGXGTT GCTGIGCAT6 A9XCTTTAC3C TCAOCSCaXSAC 1560 

CATTOTOGTC ACTGftGTGGA GftACdUUTTT TOGTCSTQCT ATTGAACACAC ASGMSAAOGC 1620 

TAGC06GGCA OSAGCASTGG ACTCTC7GCT AAACTTOGAG ACGOTGAABT ATTACAACXSC 1680 

CSRGfUGTlAC GAAGTGGAAC GCTATCQAGA GQCCATCRTC AAATATCAGG GTTTGGaOTQ 1740 

GAAGTOQAGC GCTTCS^CXGG TTTTACIAAA TCAQAOCCA9 AACCTGGTGA TTGGGCTOSG 1800

GCTCCTCGCC GGCTOCCTGC TTTGCGCATA CITTGTCACr GAG<7U9AAGC TACAGGITGG 1860 

GGACTATCXTO CTCCTTGGCA CCTACATTAT CCAGCIGTAC ATGCCCCTCA ATIGGTTIGQ 1920 

CAOCTACTAC AG3KTGATCC AGaCCAACTT CATTGACATG GaGAACATGl! TT{3aiCTTGCT 1980 

^ GAAAGASGA6 ACAGAAGTGA AGGACCTTCC TGGAGCAGGG OCCCTTCGCr TTCAQAAGGG 204D 

OCGTATTGAG TTTG2^GAACG '1:GCACTTCAG CTATQCXSAT GGGGGGGAGA CTCTGCAGGA 2100

CGTGrCXTTC ACTGTGATGC CTGGRCAGAC ACTTGCCCTXS QIGGGCGCAT CTOGOGCAGG 2160 

GAM3AGCACA ATnTGCGGC TGCTGTTTCG CrTTCIACGAC ATCAGGTCTQ GCTGCATCCG 222Q 

AATAGATGGG CAGOACRTTT CACAGGTGAC CO^SGCCTCT CTTCCGGTCTC ACATTGGAGT 2280 

TGTGCCCCAA GACACTGTCC TCXTTAATGA CAOCArrOGOC GACAATATOC GTTACOaGCG 2340 

TBICACA8CT GGGAATGATG AGGTGGAGGC TOCTGCTCAG GCTGCAOaCA TCTCArTGATGC 2400

CATTATGGCT DTCCCIGAAG GGTACAGGAC ACnGGTGOGC GAGOGGGGAC TGAAGCrFGAG 2460 

CGGOOGGGAG AAGCAGOGCG TCGCX:ATrGC CCGCAGCATC CTCAAQGCSC COGGCATCAT 2520 

TCTGCTGQAT GAGGCAAOGT CW3GGCTQGA TACATCTAAT GAGRGGGCCA TCCAGGCTTC 25B0 

TCTGGCCAAA GTCIGTOOCA ACOQCACCAC CATOGTAGTQ GCACSVCAGGC TCTCAACTOT 2640 

GGTCAATGCT GACCAGATCC TOGTCRTCAA GGATGGCTGC ATOGTGGAGA GGGSACGACA 2700

CGAGGCTCTQ TTGTCCCOaG GTGGGGTQI'A TQCTGACATG TGGCAGCTGC AGCRGOQACA 2760 

GOAAQAAACC TCTGAAGACA CTAAGCCTCA GACCATGGAA COaXQACAAA AQTTTGGCCA 2820 
CTTCCCTCTC AAAOACTAAC CCAGAAGOGA AtAAGATGTG TCTCCTTTOC CTOGCITATT 28 DO 

TCA7GCTGGT CTTGGGGTAT GGTGCTAGCT ATOGTAAGGG AAAGGGACCT IrTCGQAMVAA 2940 

CATCTTTTGG GGAAAXAAAA ATGTGGACTG TG^AAAAAAA AAAAAAAAAA AZyV 2993

Seq ID NO: C119 DNA Sequence
Nucleic Acid Accession #: NM_000676
Coding sequence: 333..1331
1 11 21 31 41 51

| | | | | |

GGGCflATTTG TTAGTTATCC GOCGCCAOCA. AGAOSCGGCA O0GCGC3CXGG ACCGQAGGOG €0 

J CCCOGCGC6Q GOGCGAACTT TGGGCTOSGG CGAGTGGGIG QTGCTOCGCC CAOCCC6AGA 120 

CGGGCGGGCG CGCGGGGCAA TGG6TGOCGC CrCTTCSOCOG CGGGGGGCCC CaAOCX3GTO& IflO 

GTCCOSGCCA CCAGCGCCCC AGCXZCOQAQQ CTOW5AAGOG GCAGGCGGAG GGGCXSGTCGG 240 

GGOSCTArGG CCATQCCGGG CGGGTCTCAC GcaaCTQCCC CTCGCCCGGC GCJGCCTTCQQ 300 

^ - TAGOGGGOaC COGGGGCCCA OCXGGCOOSG CCATGCTGCT GGAGACaC^G GADSCXXTTGT 360 

lU ACGTGGOGCr GaAGCTOGTC ATCQCCGCOC TTTC5GGTGGC GOGCAJVOGTG CIGGTCTQCQ 420 

COGOGGTGGO CAOGGCGAAC ACTCTGCAGA CGCCCft.CCftA CTACTTCCTO QTOTCOCTGG 480 

CTQCMJCCXrGA COTGQCCGTG GGGCTCTTCQ CCATCCCKTT TGCCATCMC ATCAGCCTQQ 540 

GCTICTGCAC TCSACTTCTAC QGCTGCCTCT TCCTCGCCTG CTTCJGrQCTQ GTGCTCACGC! 600 

^ ABJUJCTOCK:^ CrrCftOCSCTT CrCfGCKGrOa CAOTCGACAG ATACCTGGCC ATCrGrOTCC 660 

ID OGCTCAISGTA TAAAAGTrTG GTCRCGGGGA OCXXSAGCAAG AGGGGTCATT QCTGTCClCr 720 

GGQTOCTTGC CTTTOGCATC GGATTGACTC CATTCCTGGG GTOOAACAGT AAAGACAQIQ 780 

CCACCAACAA CTGCACA6AA CCCTGGGATG GAACCftCGAA TCAARGCTGC TOCCTTGTGA 840 

AGTOTCTCTT TGftGAATGTG GTCOCCaTGA QCTACATGGT ATATTTCaAT TTCTTTOGGT 900 

GTGTTCTGCC CXXACTGCTT ATAATOCrGG TGATCTACRT TAAGATCTTC crGGTCGCXn- 9S0 

QCBGGCAGCT TCAGCJCaCACT GfiGCTGATOG ACCftCTCGAG GACCAOCCTC CAGCGGGftJGA 1020 

TCCATGCROC CAAGTCACTG GCCATGATTG TGGGGATTTT TGCOCTGTBC TGGTTAOCTQ 1080 

TGCRTGCTGr OrAACTQTGTC ACTCTTXTCC AflCCAGCTCA GGGTAAAAAT AAGGC3CAAGT 1140 

GGGCAATGftA TATGGOCMTT CTTCTQTCAC ATGGCSArrc ACTTGTC31AT CCCATTGTCT 1200 

ATGCTTIkCCXS GftACCGAGAC TTOCGCTZ^ CTTTTCRCAA AATTATCTCC AQGrATCTTC 1260 

ZD TCTGOCSiAGC AGATGTCAAG JUSXGGGAATG GTCAGGCIGO GOTACAGCCT GCTCTOGGTC 1320 

TGGGCKTATG ATCTAGGCTC TOGCCTCTTC CAGQJW2RAGA TACPkAATCCA CAAGAAACM 1380 

AGAGGACACG GCTQQTTTTC ATTGTGAaAB ATAOCTACAC CTCACMSGA AATQQRCTGC 1440 

cxcrcTTQao cacrrcccTG oRQcrACJCRC gtatctrgct aatatgtatq rrercAGTAGr 1500 

Qjn. AGOCTCCRAG GMrfGACZlAA TATATTTAIS ATCTATTCAG CTGCITTTAC TCT6TQQATT 1S60 

ATGCGAACMS CTTGUVTGGA TUCTAACSUSA CTCTTTTOTT TTTAAAAGTC TGCXTTTGTTT 1S20 

ATGt^TBGAAA ATTACIGAAA CTATTTTACT GlGAftACAGT GTGi\ACrATT ATAAlGCAAA 1680 

TACTTTTl!ftA CTTABAGGCA AltSGAAAIUIT AAAAGTTGAC T^TACTAAAA A3G 1733 




Seq ID NO: C120 DNA Sequence
Nucleic Acid Accession #: NM_052932
Coding sequence: 217..786



1 11 21 31 41 51

| | | | | |

CCCA^SOCCGG CCCC6C0GOC CCX3GCTGG3C ACGCGACGCC CCCTCCfiGGC CCCXSCTOCTG 60 

CGCCCTATTT GGTCATTCSaO GGGGCAAGCG QCGGGAGGGG AAACQTGCGC GGGOSAAQGG 120 

QAAGOGGAGC CGGCJGCCGGC TCCXSCAGAGG AGCOSCPCTC QCCGOCGCCA CCTCG6CTGG 180 

GASOCCACGA GGCTGCCDOCA TCCTGOCXTTC GGAACAATOG GAOTOGGOGC GCGAGOTGCT 240 

^ TGGGCOGGGC 1X3CTCCT6GG GACSQC^fGCAG GTGCTAGCOC TGCTGGGGGG OGCCCATGAA 300 

AGCSCAGCCA TGGC3QGAGAC TCTCCAACSiT OTGCCTTCTG AOOVTACRTUi TGAAACTTCC 360 

AACAGTACra TQAAACCAOC AACTTCAGTT GOCTCftQACT CCA6TAATAC AflCGGTCACX: 420 

ACXS^TGAAAC CTACAGCGGC ATCTAATaCA ACaACACCAG GQATGGTCTC AftCftAATATC 480 

ACTTCTACCA CCTTAAftGTC TAC31C3CCAAA ACAaCRAOTG TTTCACftfiRA CRCATCTCAfi 540 

AtAXOWAT CCACAATOAC OCTAAOCCAC AATAOTTCAG TGACftTCTGC TCCTTCATCA 600 

DU GTAACAATGA CAACAACTAT GCATTCTOAA GCAAAGAAAfi GATCAAAATT TQATACTGQG 6C0 

AGCTTTQTTG GTGGTATTOT ATTAAOGCTG GGAQTTTTAT CTATTCTTTA CATTGQATaC 720 

AAAATGTATT 2UJCCAAGAAG AGGCATTOOG TATCGAACCA TAGATCAACA 1GA1GCCATC 780 

ATTTAAGGAA ATOCATQGAC CAAGGATGGA ATACAGATTG ATGCTGCCCT ATCAATTAAT 840 

TTXQOTTTAT TAATAflTTTA AAACAATATT CTCTTTTtGA AAATAOTATA AACAQGOCAT 900 

DD G^XSKTMOXi TACnX3T6TAT TA03TAAATA TGTAAAGATT CTTC3VAGCSIA ACAAQQGTTT 960 

GG6TTTTGAA ATAAACATCT GGATCTTATA GRCCOTTCAT ACAATGOTTT TAGCAAQTTC 1020 

ATAGTAAGAC AAACAAGTOC TATCTTTTTT TTTnGGCTG GGGTQGGOGC ATTOGTCACA 1060 

TATGACCSGT AATTGAAAGA OGTCATCACX GAAAGACAGA AIGCCATCTG GGCATACAAA 1140 

TAAGAAGTir GTC3UAGCAC ^rCAGGATTTT GGGTATCTTT TOTAGCIQU:: ATAAAGAACT 1200 

OU TCftenSCTTT TCAGAGCTGQ ATATATCTTA AtfTACiaWTO OCAGACftOAA ATTAaMAAT 1260 

CAAACTA6AT CTGAAlK^TA ATTTAAGAAA AACATCAACA TTTTTTGTGC ^ETTAAACTGT 1320 

AGTAGTTGGr CTAQAAACAA AATACTCCAA GAAAAAGAAA ATTTTCAAAT AAAACCCAAA 1380 

ATAATAGCTT TGCTTABCCC TCTTAGQGAT CSCATTGOAGC AITAAGGAGC ACATATTTXT 1440 

ATTAACTTCr TI!n»GCTTT CAAIGTTaAT GTAAITTTT6 TTCTCTGTGT AATTTAGGTA 1500 

AACTGC&GT6 TTTRACATAA TAATGTTTTA AAGACITABT TGT CftC3TA TT AAATAATC3CT 1560 

GGCaVITATAC GGAAAAAACC TCCTAQAAGT TAGATTATTT GCTACTGTGA GAATATIGTC 1620 

ACCACTGGAA GTTACTTTAG TTCATTTAAT OTrAATTTTA TATTTTGTGA ATATTTTAAa 1680 

AACTGTAOAG CTGCTTTCAA TATCTAQAAA TTTTTAATTG AQTGTAAACA CacCTAACTT 1740 

„ TAflGflA AARG AACOGCTTGT ATGATTTICA AAAOAACATT TAGAATTCTA TAGAGTCAAA IBOO 

/U aCTATAGOST AAaJGCTGTCT TTATTAftSOC AGGGATTGIG QGACTTCSCCC CAGGCAACTA 18 SO 

AA£X:TGCAGG AT GAAA ATGC TATATTTTCr TTCATGCACT GTGGATATTA CTCSAQATTTG 1920 

QGOAAATGAC ATTT TTAT AC TAAAACAAAC AOCAAAATAT rTTTAGAATAA ATTCTTAGAA 19B0 

AGTTTTGAGA GQAATTTTTA GAGAGGACAT TTOCTOCTTC CTGATTTGGA TATTCCCTCA 2040 

AATOCCICCT CTIACTCCAT GCXGAAGGAG AAGTACTCTC AQATGCATIA TGTTAATGGA 2100 

GAGAAAAJMSC ACAGTATTGT AGAGACACCA ATAITAGCTA ATGrATTTTG GAGTOTTTTC 21GQ 

CAITTTACAQ TTTATATTCC AGCACTCAAA ACTCAfiQQTC AAGTriTAAC AAAAGAGGtA 2220 

TGTAQTCACA GTAAATACIA AQATGGCRTT TCTATCTCAG AGQQCCAAAG TGAATCACAC 2200 

OUSTTTCTGA AGGTCXSPAAA AATAQCTCftG ATGTCCTAAU* QAACATGCAC CTACATTTAA 2340 

ftfl ATAAAACTGT TGTCRGCTIT TGlTTTACfta ACaACOCTAG ATA^TAAOAA 2400 

OU TTTTGAAATG GATCATTTCT ACTTGCTGTC CATTTTAACC AATAATPCTGA TQAATATAGA 2460 

AAAAAATGAT CCAAAATATG GATATGATTG 6ATGTATGTA AO^CATACAT GGAGTAIIGGA 2520 

OGAAATTTTC TSAAAAATAC ATTTAGRTTA GTrTAGrTTG A^USQAGAGGT OOaCTGATGG 2 580 

CTGAGTTGTA TGTTACTAAC TTGGCX:CTGA ClGGTTGXGC AACCATTGCT TCATTTCTTT 2640 

GCAAAATGTA GlTAAGATAtT ACTTTATTCT AATGAAGGCC TTTTARATTT GTCCACIGCA 2700 



TTCTTGOrAT 
TTTGAAfACT 
CATCATTTGC 
TCTCTGTATa 
TmTGTTGT 
TTGGACCATA 
TATTTATCTG 
TTAAARTTTC 
CAGTATATTA 
TTTTTATTAT 
AATTGTTCTC 



TTCACTACTT 
TGTTTCTTTA 
CTTQAAAGTT 
TAAGTAGTAT 
TGTTATCTAT 
TTTCTTAAfiC 
AAOATCAAAA 
CTTGAATAAT 
TAGTGATAAT 
AATGACCAQC 
CTTCTAAAAA 



CftAQTCMSTC 
GC31CTTTGAA 
TCCTCTGCAr 
AATTTGTTAC 
AAAAATTGAlS 
ATAAAAAAAT 
CSUUClCAATCC 
CCATGAAAGG 
TTTGTATTTT 
TTTTGGTATT 
AAAAAAAAAA 



AGftACTTCaT 
GATA6AAAAA 
TGOGTrrOAA 
TTTC3UUiTAC 
□GAAAmoaTT 

gctcagtttt 

AQATOTATAA 
AATAATTCAA 

CAAM7UUVAAA 
TCTJirOTTAC 



Seq ID NO: C121 DNA Sequence
Nucleic Acid Accession #: NM_004195
Coding sequence: 1..726



1 
1 

ATGC3CACAfiC 
□GBCTC?W»CC 
CTTGGGAJCGG 
raCCCSGQGOG 
TG03GAGACC 
CAGTCGCAGQ 
TCGGGGGGCC 
ACT01X3TTCC 
GAGOCX3CTTG 
iW:CTCJGGOCC 
GAGAOCCAGC 
OCGGA06AAG 
GrXGTGA 



11 

I 

ACBQOQOaAT 
TGGGTCAGas 
GAACSQGACGC 
AGGAfiTOClG 
CTTGCTOCAC 
GGAAATTCAG 
AGG2UUSGC3CA 
CTGOGAACAA 
GGTGGCnSAC 
AaCTTGOACT 

AGGGGGGC3C3A 



31 
1 

CSGGOGOSTTT 
CCCCACCGQG 
GGGCTGCTGC 
TTCGGZUSTQQ 
CB«:CTGO0GG 
TTTIOGCTTC 
CrqCAAACCT 
GAGCCACAAC 



GCACATCTGG 
GGTGCCGCCQ 
GOGATOGCTCA 




CQC3GTTCACA 
QACTGCATGT 
CACC&CC3CTT 
CAOTOTATCG 
TGGACAGACT 
GCT&rOTGCQ 
CTGGOOGTGG 
CAGCTGAGQA 
TOaACCGAAG 
GAGGAGAAGG 



Seq ID NO: C122 DNA Sequence
Nucleic Acid Accession #: AK091896.1
Coding sequence: 28..1572



1 
1 

AGATCGGCGA 
C3GCCGCAACC 
ATCOCCTTCC 
CAGATCTCCT 
GGOGTCTTCA 
ATCTCX3CT06 
ATGGCJQCTGG 
AGGATGTACC 
GGTQCTCTGC 
GCCMVTAGCA 

AAGGAOGGGG 
OaUStGCCCA 
CIAQAQOAGGC 
AAGGAAGATQ 

ATCAOGGSOS 
GTGTACSuaCT 
CCCRSOCTCr 
AGUUCGAAQC 



CTGTTTCrCA 
AAAGGCTGTG 
ATGCTGGTTG 
AXCrEIGGTT 
GCTOGACTCC 
TACC?bG2US6T 
CnCGACCATA 
AAGTCTTCTG 
TCAAAATCAT 
TOGTAAGAGC 
ATTTGTCACC 
TAAAAATAOG 
TTAAGCAAAA 
CITAATAGTC 
C&AACACCAA 
TAAGCTOOCT 
TATTAAATCC 
TTTAAAACAA 



11 
[ 

GCCCGTCAGC 
TGCafiCCCM 
TGGGGCCCAC 
OGGTCTTCTT 
AAAGGACCCT 

TGrrrQcscor 

OGGGCTTGGC 
ABAAOOACTC 
TGAGCCCCCT 
CGOCCTAAGAC 
ACGTAGATGC 

TGGCTGTGCT 
CCCTGCTTCT 

GCX»AAGGAA 
CGCTGGTACr 
ATGCTGTGGA 

TCaoaaacTT 

OSGCCAOCAT 

QCAGCTCCTT 
CAACGACAGX 
6TTCSATATT 
GrCTGGCTTT 
CATOkGTTCC 
AAAACTGGGT 
CTGTTTGASA 
CACTAAAACI 
TAlffl\AGTTTA 
TGTCC3U3ATA 
TO^TGCATQG 
TTGRAATTGT 
CAAGCTGCAA 
CAAAGTCTCT 
AAG6AAAA9T 
CATAOCAACT 
AGCCASTATT 
AGOTAATAAA 



21 
I 

CTGOGCCAIG 
GCTCACCTAC 
GCIGCTPOGAC 
CTCGGAGCAG 
GGCCCAGTCA 
CATCOCCTTC 
CATGGGCTSC 
GGCOGTCTTC 



31 
I 

GdCTQCOACG 

tggagggtct 
ctgcsqctqtc 
ctctgcctcx? 
ctatgggcoc 
tgoogoqacg 
atcgagacog 



CACCTCCXX3A 
CAAGCCTTGG 
AGTGTOCTAT 
GATGCTSCT6 



GOOCCCAAAG 
GAAOCTCAQA 
GTTCATGAC6 
GAAGGCCCHG 
CATCACACI1G 
GGTTTTCATC 
CAAOTTCGTC 
<X:CCAGCATG 
GCTGOTQACA 
CCAGGCTCAG 
TAOCTTCTAT 
TAOOCAA6AC 
GftAGAAOGCA 
AA GCrG GGTq 
IGGTrGGGTA 
CCTGGCTXCT 
CCCAGATOGG 
ACCATACTCT 
AAAGCTCOUr 
OTTATTCCCT 
TAAATT<3CTa 
TGAATCTTGC 
TA6AAAAGCT 
AAAAATCAAA 
TATTCAAAGC 



CCTTTOCTGT 
GGOCSICCTGT 
TCCSUVCCAGA 
GCCTTCTGGA 
XCGAAaGM3C 
GBVGCTTGCCT 
TTTCAGTCAC 
OOAGCCCCTT 
GATGGGXTGA 
^CCTOTQQGAC 
GGOGSGCTOC 
AACX3TGGTTG 
TT0CTGTTC6 
CXGGCCTACA 
GGGGCftGGAG 
GGGAGCIAXA 
ATCTTGCTCC 
AGATCAATTG 
AGAGAAGACT 



AGAOCGAGCT 
CCACTTTTTA 
GTAGTTTA6T 
CCOTACTTTG 
ATGCAAAAAA 
GCTTGCATTC 
GTACTAOGCA 
ATACRGATAA 
AAAGTTAAAC 
CAAGTTCTAT 



41 

I 

GCGGCCTGGC 
GCX3GCCCTGG 
CGAOGOGCTG 
GTGirCCAGCC 
GTCC0CX3VGG 
ACTGTGCXITC 
GCACCCAGTT 
TqO CftGG GTC 
CCGCCTGCST 
GrrcftGTOCAT 
AOGCCAGAAG 
GGGGOCTGGG 



41 

I 

GCOG0GT6TC 
TCTTCAOCTT 
AGACGCACftG 
TSCTGOOCSU? 
'iXai'llSAOCTC 
!CGAAGQTQCT 
TGGOCAACAT 
TCCATTTCTT 
CIGAGGCCAA 
TCXaVTGTCTC 
CGTTCXX3USG 
TCAIGGCCCT 
GGCIGCCGAC 
TGG2V3)U31CA 
AOCTAGGGGA 
ATTCCTTCrr 
GGGQTGCCrA 



GAAOTTTCTT 
AGTACTAAGT 
TATOTCTTTT 
AATGTAGGTT 
ATATTTTGCT 
CTXGAGAAZG 
GAAGCSCAATT 
ACAOAGTTGG 
TCTTCTTTTC 
TTTTAOATAJV 



GAGGAAATTA 
CAAGTTATCI 
AAGOnAGGAG 
GGGTTTGftGA 
GATC3VCTGCr 
GGCATGCTCA 
CTGCRCATAT 
TCTCTTTGGT 
6ATTQTAAAA 
1CATACAAAC 
ATATOkCTTC 



TCTOCATTCC 
GOGTGGl^GGT 
TGQGQACGGC 
CSGGAGGACTC 



TGTTTTTCJCA 
GAATGQAAAA 
TTCAGOCTCr 
03CIC3GTCA 
AATTGft CTOC 

tcttccctgg 
aca5ccx30gc 

T^TTCTTCA 



AAQGATTTTC 
CAATAGCTTA 
ATACTAATGT 
TTTTCATTIT 
TTAXAATTAA 
CT 



51 
I 

GCTGCTGTGC 
6CBCCTOCTG 
CTGCX3GCGAT 
TQAATTCCAC 
CCAGGGGGTA 

GGGQACcrrrc 
GGGGTTTCIG 
CCOGOGQGCA 
CCTCCTCCTG 
GTGGOCCCGA 
CTGCCRGTTC 
AGACCT6TGG 



SI 
1 

GGGGCTGCTC 
CG6CCTGTGC 

crcjQCTQCcac 

OSGCCTCGGG 
CrCTCTOGCC 
GGCCTCAGTC 
GCAGCIGGTA 
OGIGGGCXTr 
CTGCTTOCCT 
CAGGGTGCXG 
GCTOACTCCA. 
CAT06ATCIT 
CTGCTGTCDC 
GCCTOCTGAG 
TGAGGACCTG 
TQCCATCCAC 
TTGOGCXTTTG 
T6GCTA0CTC 
CATATECCrCA 
QAC!GTTCCIG 
AAGGCTG6GC 
GCTGCASTAC 
G GTOC TGCaS 
GTGTGGGGTG 
CIU3C3ATGCAC 
CTCTGAGTGC 
TGATCACX3U3 
ATGGCIATTC 
ItSGTAOCTGB 
TTCAGACTGT 
GCTTC3UOTCC 
TTGAAGTTC6 
ATATTTCAAT 

GTGcrrrrcA 

TCTATATTCT 
AGTQGTATGC 
GACAGCTGGT 
ATGAATXACA 



Seq ID NO: C123 DNA Sequence

Nucleic Acid Accession #: NM_002203.2

Coding sequence: 43..3588



2760 
2B20 
2880 
294D 
3000 
3060 
3120 
3180 
3240 
3300 
333Q 



60 

120 

180 

240 

300 

360 

420 

460 

S40 

fioo 

€60 
720 
72S 



60 

12D 

180 

240 

300 

360 

420 

4B0 

540 

600 

660 

720 

780 

840 

900 

960 

loao 

1080 
1X40 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
I860 
1920 
1980 
2040 
2100 
21G0 
2220 
2280 
2322 



11 
I 



21 
I 



31 
I 



41 
I 



51 





60 


TTIAAATTGT 


120 


TTCAA13TGAA 


LBO 


ACIGGTTGGT 


240 


TCCTGTTOAC 


300 


TCCAAATGTT 


360 


GG6AACT0GA 


420 




480 


ACCTGCAACT 


540 


lAQTATTTAT 


600 


TATAGGOCCC 


660 


GTTTAACTTG 


720 


CCSiATATCSar 


780 


CTATTCAGCA 


840 


CGGIGAATCA 


900 


TATACTGA06 


960 


AAATTTAATA 


1020 


TGTGTCTGRT 


lOQD 


CATTGAAGGT 


1140 


C3VGTGCAGAT 


1300 


CTGGAfiTGOS 


1260 


CTTTQACCAA 


1320 


AATTTCTACT 


1380 


CCAQATAGTG 


1440 


AC3GTGA0CSUQ 


1500 


CACCATZACA 


1560 


□OAAGGAAGA 


1620 


TGAAGGCCCC 


1680 


CATCAACATG 


1740 


TGGAGGT6TA 


1800 


AATCTTOGC^ 


1860 


TQGCTATGGA 


1920 


AGTQBTTCAA 


19B0 


AGAAAAAATC 


Z040 


AAAQTTCAGA 


2100 


TGCnSftXGGA 


2160 




2220 


TATACAOGAGt 


2280 


AAACCXZTGGC 


2340 


TCCTTTGCAC 


2400 


CGGAClkAATA 


2460 


AACKTTTTCA 


2520 


TGATTTTTCA 


2580 


AACATGCCAG 


2640 


AAAGAl3Ad5AA 


2V00 


GAATCAG6GQ 


2760 


^EAATTTQOTC 


2020 


TACGAACATA 


2D80 


TTTTQAAOAT 


2940 


AQTAA0CATG 


3000 


GKTGTACCTA 


3060 


(3ATCCACTGr 


3120 


GCACAGOUUl 


3180 


AfiACQTTCAC 


3240 


TTTOGCATCA 


3300 


TAA0CCXGA6 


3360 


ACCTGA*!:G2U3 


3420 


CCTTTT6CTG 


3480 




3540 


CAGCAGACXn? 


3600 


CATOQATTTC 


3660 


ATTTTAAGAO 


3720 


CGGGQGGCAG 


3780 


GTARTCTTTA 


3840 


GAAATdCITC 


3900 


GQAAAAGI6T 


3960 




4020 


AAAACAAAAC 


4080 


TTOCAAOTOA 


4140 


AGGGCTGCOC 


4200 


GCTTAOIUTTA 


4260 


AGGCACA2WL 


4320 


AASCCCATC3C 


4380 


CAACAGTTCr 


4440 


GATGAfiTAAT 


4500 


C3TG9TACCAr 


4560 


ATCCCCTCCT 


4620 


OATATTAGGG 


4680 


ACTCCTTAAC 


4740 


GAflTGAATTT 


4800 


CftGGATTTCA 


4860 


ACCTTCTGTG 


4920 


AGGCAAGnr 


4980 


GCAATClGcn 


5040 
TTTTAACAAG CACCCCAGTC ACTAGGATGC AGATOQAGCA CACTTTGAGA AACftCXZfiCCC 5100

ATTTCTACTT TTTGCACCTT ATTTTCTCTG TTCCTCSAGCC CCCACRTTCT CTAOGAGAAA 5160 

CTTAGATTAA AATTCACAGA CJiCTACATAT CTAAflGCTTT GACAAGTCCT TQACCTCTAT 5220 

AAflCTTCAGA GTCCTCATTA TAAAATG(3GA AGACTGAjSCT GGAGTTCAGC AGTGA1GCXT 5280 

TTTAGTTTTA AAAGTCTATG ATCTGATCTG QftCTTCX^TAT AAl'ACAAATA CACMTCCTC 5340 

CAAQRATTTQ ACTTGGAAAA CS 53 SX 

Seq ID NO: C124 DNA Sequence
Nucleic Acid Accession #: NM_031460
Coding sequence; 103..1Z01 

1 11 21 31 41 51

| | | | | |

A^CAGGCGTT TGCGAGAGGA GATACQAQCT QGACGCCTGG CCCTTGCCTC CCACX3GGGTC 60 

ID CTAS5TCCAOC GCTCCCOOCO CCGGCTCCCC GCCTCXCCOQ CTATGTACOG ACCGC5GftGCC 120 

CG6C3C3GGCTC CCXSAGGGCAQ GGTC0G6GGC ^EGCQCXSGTGC CCAGCACCGT GCTCCTGCTG 180 

CTCGCCTACC TOeCTTACCT GGOGCTGGGC AOOGGOGTGT TCTGGACGCT GGftOOGCOSC 240 

GCGGCGCAGO ACTCX:ASC0G CAGCTTCCRG CGCGACAAGT GGGAGCTGTT GtraOAACTTC 300 

on ACGTGTCTGG ACOCSCCCoaC QCTGGACTCG CTGATCCX5GG ATGTCOTCCA AGCATACAAA 360 

AACGC3AGCCA GCCTCCTCRI3 CAAjCACCACC AQCATGOGGC GCTGGGAGCT GOTGOGCTCC 420 

TTCrrCTTTD CTGTGTCXaw: CATCACCAOC ATTGGCTATO QCAAOCTGAG 0C0C3U«»0Q 480 

ATOQCTQCCC GCCTCTTCTG CATCTTCTTT GCOCTTGTGG GGATCCCACT CAACCTOBTG 540 

GTGCTCAACC GACK5GGQCA TCTCATGCAG CAGGGAGXAA ACCACTGGGC CAGCAGGCTG fiOO 

^ QGGGGCADCT GGCAGGATCC TGACAAGGCG CSGSTGOCTOO CGGGCTCTGG CGCCOTOCTC 660 

TCGGGCXrCCC TGCTCTTOCT GCIGCTGCCA CSCGCTGCTCT TCTCCCACAT GGAGGGCTG6 720 

AGCTACACA3 AGGGCTTCTA CTTCGCCTTC ATCACGCTCA GCAOCGTGGG CTTGGGGGAC 7B0 

TACGTGATTG GAAT6AAODC CTOCCAGAfiQ TACCCACTGT GGTACAAlQAA CATGQTQTOC 840 

CTGTGGATCC TCTTTGaOAT GGCATGGCTG GCCTTGATCA TCAAACTCAT CCTCTOCCAG 90O 

CTOGRaACOC CAC3GGAGGC3T ATGTTCKTGC TGCCACCACR GCTCTAAGGA AGACTTC3WG 960 

jU TCGCAAAGCT GGA^sACAGGG ACCTGACCQG GAGCCAGAGT CCCACTCCCC ACAGCAAGGA 1020 

TGCmrcXM AOGGACCCAT 6GGAATCATA CAGCATCTGQ AACCTTCTGC TCAGOCR3CA 1080 

OGCTGTGGCA ASGACAGCXA 6TTATACTCC ATTCTTTGGT GGTOGTOCTC GQTAGCAAGA 1140 

CCCCTQATTT TAAGCTTTGC ACATGTCCRC CCAAACTAAA GACTACATTT ICCATCXaUIC 12 DO 

^ _ CTAGAGGCTG GGXGCAGCTA TATOATTAAT TCTGCCX3^T AGGGTATACA GAGACATGTC 1260 

CXGOGTOACA TGGGATGTGA CTTTOGG6T6 TOGGGGCASC ATGCCCTTCT CCOCX3«^rTC 1320 

CrTACTTTAG CGGGCTGCAA TGCCGCCGAT ATGATGGCTG GGAGCTCTOa CftGOCATACG 13BD 

GCAOCATQAA QTAQCGGCAA TGTTTGftGGG GCaVCAATTAiG ATAGGKA6AG TCTGGA*rCTC 1440 

TGATGATCAC AGAGOCMCC TAACAAAJDGQ AATATCAOOB ACXXMCTTT ATOZOaGAGA ISOO 
GAAATAAACA TCTATWVA 



Seq ID NO: C125 DNA Sequence
Nucleic Acid Accession #: NM_004154
Coding sequence: 309..1295



1 11 21 31 41 51

| | | | | |

AAGGaCA<5AG GAGGGGCCCT TCCTGTCAGC TGGCTGGGAG CAGAGGTGGC TTTGTCTTTT 60 

eoOAAGAACT QGTTCrGTGQ AATTTGTQCT TATTTOOCRT CAAGSATCAA GQAOCTGCTC 120 

TOeGGCTACC TCAGOGCCCC ACAGGATGAG GQSCTGQTTT TCAGATGAGT TTTCTGCTTa 150 

DLf CCTOTCATCT GGATAGIIGTC TAAAAATTTG CAAACTSOCT TCnGTCAGT QTCTTGCTCA 240 

TTCTTCATGA CSlCTCSCTGAT ATGTCTCTCA GTICTGCrCAT CTGCXGCCIC TCCAGOCTTC 300 

TGCCAQAACA TTGCAiOGCGA CAlSrarcCAflG CACAGAACTG ACTGGCAGCA GGOGCI^aCTC 360 

CAOTAGTGQG AATTTQCTCC AGCACTTCAC GGACTGCAAG CGAGGCACTT GCTAACTCTX 420 

GGATAACAAG AOCTCTGOCA GAAGAACCAT QGCTTTQ0AA GGCGGAGTTC AGGCTGAGQA 480 

JD GATC5QGTGOG GTOCTCAGTG AOCCCCTOCC TCCXn«AACA TAGQAAACCC ACCIGGGCAG 540 

CCTATQGAAM GOACRATGGC ACAGGCKAGG CTCTGGGCTT QCCAOCXACC ACCXCmxCT €00 

AGCGCGftGAA CTTCAASCAA CTGCTGCTGC CAOCTGTGTA fFTGGGOGOTG CIGGGGGCIG 660 

GO CTGO OeCr OAACATCTGr GTCATTACCX: AGATCTGCAC GTCCC60CGS GCOCTQACCC 720 

iCA OCTCGGCOQT GTACACCC£A AACCTTGCTC TDOCTGAOCr GCTATATQCC TGCTCCCTGC 780 

OU CCCTGCTCAX CmCSUUTTAT GCOCAAiGGTa ATCACTGGCC CTTTGGOGAC TTOGOCTGOC 840 

eCCTGGTCCG CTTCCTCTTC TATGCCAACC TGGA£X3OCA0 CATCCTCTTC CTCAGCieCA 900 

TCAGCTTCCA GOGCTACCTG GGCATCTGCC AOCOGCTGGC COCXTDOacSU; AAA08TGGG6 960 

GCCXSGOSGGC: TQCCTOGCTA GTGTGTGTAG COGIGIOQCT GG0C3GTGACA AOCKAGTGCX: 1020 

TGCCCACAOC CATCTTOGCT GCCRCAGGCA TCCMOGTAA COGCACTGTC TGCTATGAOC 1080 

03 TCRGC0C3GCC TQOOCTGGGC AOCCACTATA TGOOCTATGG CaVMGCTCTC ACTGTCaTCG 1140 

GCTTCCTGCT GOOCTTTGCT GCXICIGCTGG OCTGCTACTG TCTCCIGGCC TGOCGCSCTGT 1200 

GCOQCCAGGA TGGCCXX30CA GAaCCrGTGO CCCAGGA60G GOQTGGCAaS G0GGC0C6CA 1260 

TGG OgiT gqT GGTGGCTGCT GCCTTOXSCCA TCAGCTTCCT GCClTOrCAC ATCACCAAGA 1320 

CAGOCTAOCT GQCAJGTGCGC TOQAC3QCCGQ GOCjrCCOCTG CACTOTATTG GAGGCCTTTG 13 HO 

CACSOBGCCTA CAAAGGCAOG CG6CC3STTTG CC3M3TGCCAA CAGCmocXQ GACOCCATCC 1440 

TCTTCTAISTT CSUrcCABAAO AAGTTCOQCC GGOGACCACA TQAGCTCCIA CAGAAACICA 1500 

CAGCXAAATG GCAGAGGCAG GGTCGCTGAG TCCTCCAGGT CCtQOlQQUaC CTTCATATTT 1560 

OCCATCGTGT GC33GQQCAa: AGOAflCCCCS^ CCAACCCCAA ACCATGOOGA GAATTAGAST 1620 

TCAGCTCAGC TGGGCA1GGA GTTAAGATCC CTCACftGGAC CCASAAGCTC ACCAAAAACT 1680 

ATTT CTTCRG GGCCTXCTCT GGCOCAGACC CrGTOGQCAT GGAGATGGAC AGACCrGGGC 1740 

CTGGCTCITS A6AGGTCCCA GTCnGCCATG GAQASCTCGG GAAA0CSU2AT TAABGTGCTC 1600 

ACA71AAATAC AG1»TGA0GT GTACT6XCAA AA ig32 



Seq ID NO: C126 DNA Sequence
Nucleic Acid Accession #: NM_007197
Coding sequence: 16..1763

1 11 21 31 41 51

| | | | | |
Seq ID NO: C127 DNA Sequence
Nucleic Acid Accession #: NM_005761.1
Coding sequence: 250..4956



ACACQTCCAA OGCCAGCATG CAGCGCCCX3G GCCXJCCSGOCT GTQGCTGGTC CTGCAGGTGA 60 

TGGGCTOGTG CGCCQCCATC AGCTCCATGG ACATGOAQCQ CCCQGGC33AC QGCAAATGCC 12 0 

AGCCCATCOA GATCCCGATG TGCAAGGACft TCQGCTACAA CATGACTCGT ATGGOCAACC 180 

IGATGGGOCA CeAGRACCAS CGOGAGGCAG CaVTCCAOTT GCAOSAlQTTC GCXICCQCrGG 240 

D TOGAGTAOSG CTGCCftOGGC CACCTCCeCT TCTTCX7CGTG CTGGCT6TAC GCGCGGATGT 300 

GCACOGAjGCA GGTCTCTACC CCCATCCCOG CCTGCCGOGT CATGTGCX3AG CAGGCCOGGC 3 SO 

TCfiAOTGCTC CCCGATTATG GAGCftOTTCA ACTTCAAGTG (3C5CCGACTCC CrGGACTGOC 420 

GGAAACTCCC CAACAAGAAC GACCCCAACT AjCXrXGTGCAtr GQAGGCGOCC AACAAGGGCT 480 

CGGAGGAGCC CAOCOGGGGC TCGOGCCTOT TOOCGCCGCI GTrCCQGCCG CftGCQGCCCC 540 

ACAGC6CGCA GGAGCACCOG CTGAAGGACG GOGGCCCCGG GCGCGGC6GC TGCGACAACC 600 

CGGGCAAGTT CCACXaOGTG GAGAA6AGCG CGTCGTGC3t5C GCCGCTCTGC ADQCCDGGOG 660 

TQGACSTBTA CTQOAQCOaC OAGOACAAOC QCTTCGCAGT GGTCTGGCTG GCCATCTGOG 720 

C qgTG CTGTG CTTCTTCTCC AGCGCCTTCA COGTGCTCAC CTTCCTCATC OACCXITSaCCC 780 

GcrrccxaCTA cccooAscac cccatcatcx toctctcc3«p gtoctactgc gtctactccg 6 40 

TQaBCJ3W3CT CATCCX3CCTC TTOGCOSGOG CCGRGAGCM CGCCTGGGAC COGQACAGCG 900 

GOCaVSCTCTA TGTCSVTCCAG GAGGGACTTGG AQRQCftCCGG CTGCAOGCTG GTCTTCCTGG 960 

TCCTCTACTA CTTCGGCftTG GCCAGCTOGC TGTGGTGGGT GGrCCTCAOQ CTCACCTGGT 1020 

TCCTGGCOGC CGGCAAGAAG TGGGGCCACG AGGCCATCGA AGCCAACAGC AGCTACTTCC 1080 

AOCIGGCAGC CTGGQCCATC OOQGOGGTGA AGACCATCCT GATCCTGGTC ATQCGCftGGG 1140 

/X) TGGCXIGGGGA OSAGCTCAiCC GGGGTCTGCT ACGTOGGOUBI CATGGAOGTC AAOGOGCXCA 1200 

CO(3GCrTCGT GCTCATTCCC CTGGOCTGCT ACCTGGTCAT OOGCACGTOC TTCATCCTCT 1260 

CGGGCrrCGT GGCCCTSTTC CACATCCGGA GGGTQATGAA GACXSGGCGGC GA6AAGAC!OG 1320 

ACMGCTGQA GAAGdCATG GTGaTTATOQ GGCTCTTCTC TGT6CTGTAC ACGGTOCCGG 13 BO 

GCACCTGiTGT GATGGCCTGC TACITTTAOS AAGGCCTCAA CSITOGATTAC TGGAAGAICC 14.4.0 

Z-> TGGCX36GGCA GCACAAGTGC AAAATSAACA ACXJU3ACTAA AAOGCTGGAC TGCCTGAlJGG 15D0 

COGCCTCCAT OCCGGOOST6 GAGATCTTCA TGGTGAAGAT CTTTATGCTQ CTOGTGGTGG 1560 

GGATCACCAG CGGGATGTGG ATTTGOAjCCT CCAAGACTCT GCAGTCCTGG CAGCTlGGrGT 1620 

QCAGCX3aTAG QTTAAAGAAG AAQAGCOGGA GAAAACOGGC CAGOGTGATC ACCAGOQQTQ 1680 

GGATTTACAA AAAAGCGCftG CATCCCOUSA AAACTCAOCA CGGOAAATAT GAGATOCCT6 1740 

CCCAGTOGCC CACCTGCQTQ TGAACAGGGCT TGOAGGGAAG GGCACAGGOG CGCXXBGAGC 1000 

TAAGAT6TGG TGCTTTTCTT G6TTGTGTTT TTCTTTCTTC itcttCTTTT ■ rjriTTiTm' 1860 

ATAAAAGCAA AAGAOAAATA CATAAAAAAG TQTTTACC3CT GAAATTCAGG ATGCTGTGAT 1920 

AC&CTGAAAG GAAAAATGTA GTTAAAGGGT TTTGTTllOT TTTOGTTTTC CAGOGRftSQG 1980 

AAGCTCGTCC ABra AAOTAO CCTCTTGIOT AACIAAX1T6 TGGTAAAGTA GTXGATTCftfi 2040 

CCCXCAG2VAG AAAACTTTT6 TTTAGAGCCC TCCXSTAAATA TAC3VICTGTG TATTTGAGTT 2100 

GGCTTTGC3A CXXaTTTACA AATAAOAX3GA CAQATAACTG CTTTGCRAAT TCAAGAGCCT 2160 

OCCCTGGGTT AACAAATGftG CCATOCCCAG GGCCCACCCC CAGGRAGGCC ACftOTGCTGG 2220 

GOOGCATCCC TGCAGAGGAA AGACaGGACC COQaaCCQQC CTCACACGCC ASSOSHTrXG 2280 

GAGTTGCTTA AftATAGftdC TGGCCITCAC CAAT2USTCTC TCTGCSUUSZU:: AOAAACCTCC 3340 

H\J ATCAAACCTC ACATTTGTQA 2U:!TCAAAC3GA TOTGaUMCAC ATTTTTTTCT CTTTCCPlGlA 2400 

AAATAAAAAG AGAAACRAGT ATTTTGCTAT ATATRAAGAC AACAAAftQAA ATCTCCIAAC 2460 

AAAAGAACTA AGftGGCCCRG CCCTCAGAAA CCCTrCAGTQ CTACATTTTG TGGCITTTTA 2S20 

ATGQAAACCA AGGCAATGTT ATAGAOGTTT GGACTGATTT GTGGAAAOGA GGGGGGAAGA 2580 

GOGAGAAGGA ^TCATTCAAAA GTTACCCAAA QtaGCrTKrrO ACTCTTTCTA TTGTTAAACA 2640 

A ATCATTTC C ACAAAGAGAT CAGGAAGCAC XAGGTTGGOV GAOAaiClTT 6TCTAGTGTA 2700 

TTCTCIMAC AGIGOCAQBA AAGAGTGGTT TCTGOGTrOTG TATATTTGTA ATAXATGATA 27fiO 

TTTTTCATGC TCCACZATTT TATTAAAAAT AAAATAIGirr CTTTAAAAAA A 2811 



1 11 21 31 41 51

| | | | | |

GCGAOSAGGA AACX3OTG0QQ GAGOGCOCAG QGCZETGCTGC OGOCAGOSCG GCTGCACAGG 60 

CTOCCGGAQC GAGOCTGOOG CX3CGCX3GOCC TCOCOGCTCT CCTTCCTOSQ OQAGCTOOGO 120 

GGATGGGGOG GCCQCGGQAG CCOGAGCQCG CGCAGGAACC GC0G00G006 CGGCXlGGOGr IBO 

CTGCQZTGCZC GCGOGOCTGA GCCX30GGT06 OCXSCOSOGGG OCCOieCiCOGG GGGCSBGCCCC 240 

_ CCCAGCCCGA IGGAOSTCTC CX2GGIW8GAAG GCGOOOOCQC OCSOCCCCGOG O0C06GAGG6 300 

OU CCACTGCCCC TGCTOGCCTA TCXGCTGGCA CTGGOGGCTC 006GCGGGG6 CGQGQA£3QAQ 360 

CXX3OTGTGGC G6TOGGAGCA AGCCATGGGA GOCATOSCGG CQftGCCSVOOA GGA0GGC3GTG 420 

TTTOTGGCGA GOSacaUSCTG CCTOQACCao CTGOAf^'ACA 6CX:T6GAGCA CAGCCICTOG 480 

aaCCTGTACC GOGAOCaUkGC GGGCAACTGC ACAGAGOOGG rCCTCGClGGC GCCCCCXSOCG 540 

OGGCCCGGGC CGSGGaGCAG CnCAGCAAG CXGCXGCIGC CCTAOOGCGA OGGGGOGGGC 600 

GGCCTC2QGGG QGCIOCrOCr CACCGGCTGO ACCTTOQACC GGGSOGCCTG CGAGGTGCGS 660 

CCaCTGGGCA AOCTGAGGOS CAACTCCCTG OGCAAOGGCA CCGAG^rGGT GTOGTGCCAC 720 

CCXSCAQOaCT C35ACCJQOOGQ CXSTGGTGTAC CGGGCGGQCC GGAACAACCG CtGGTACCTa 780 

GCX?GTGGCC6 CCACCZAlGGT 6CTGOCTGAG COGGAGAOGG CGAGCOOCTG CAACOCOGGG 840 

GCftTCiaGACC AOGAiCSbOtSaC CATGGOGCrC AAGGA£AiQ[3G AGGGGOGCS^ CCXOGCCAiCa 800 

CAGGAGCIGG GGCSGOCTOlA GCTGT6GQAG OGCGCX36GCA GCCTGCACIT OGTGGAGSGC 960 

^TTCTCTGGA AGGGCAGCAZ CTACTTOCXC TACTACOCCT ACAACTATAC OAQCGGOGCTr 1020 

GOC&CCGGCI GGCCCAGCAT GGC60GCATC GOGCAGAGCA CCGAGGTGCT GTTCCAGGGG 1080 

CAGGCATCCX: TCGACTGOGG 0C310GGCCAG OCQGACQGCC 6CCX3CCIGCI CCTCCOCTQC 1X40 

AGCC1A6TGG AGGCCCTGGA CIGTCTOGGOG G6AGTGTTCA QGGCSGGOOGC TGGAGAGGGC 12DD 

C&iGGAGOQGC GCTCCCOCAC GACCACGGOG CtCTSCCTCT TCAGAKTGAG TGAGATCCAa 1260 

GOGGSOGOCA AGAGGGTCA6 CTGQQACTTC AAGAOGGOCG AGAGCCACTG CAAAGAAGGG 1320 

GATCAACCara AAAGAGTCCA ACCAATCGCA TC31TCTA0CT TGATCCATTC GGACCIGAC3W 1380 

TCOGTTTATG GCAGCSTGGT AATGAACAGG ACTGTTTTAT TCTTGGGQAC TGGAGATGGC 1440 

CASTTACTTA ASGTTATTCT TGGTOnGAAT TTQACTTCAA ATTGTCCAGA OGTtrATCTAT 1500 

raUWTTAAAG AAGAGACACC TGTTTTCTAC AAACTGGrrC CTGATOCTGT GAAGAATATC 1S60 

TACATTTATC TAACAGCTGG GAAAGAOeiG AG6AGAATTC GTGTTC3CAAA CIGCAATAAA 1620 

CATAAATCCT GTTCGGAGTG TTTAACAGCC ACAGACCCTC ACK30GGTTG GTGCXaTTOG 1680 

CTA CAAAGGT GCACXriTCA AGGAGATTGT GTAiCATTQ^ AGAACTTAGA AAACTGGClG 1740 

GATATTTCGT CTOQAGCAAA AAAGTGOCCT AAAATTCAGA TAATTCGAA6 CAGTAAAOAA 1800 
AAOACTACAG TOACTATGGT GG6AAGCTTC TCTCCAAGAC ACTCftAAGTG CATGGTGATUS 1860

AATGTGGACT CTAGCAGGGA GCTCTGCCAG AATTVAAAGTC AGCOCAAJDOG GAOCTGCACC 1920

TGTAfaCATCC CAACCAQAtSC AACCTACAAA GATGTTTCAG TTGTCRAOGT GATCTTCTCX: 1980

TTCGGTTCTT GGAATTTATC AQACAGATTC AACITTACCA ACTGCTC3W7C ATTAAAAGAA 2 040 

TGCCCAGCAT GCGTAGAAAC TGGCTGOGOQ TGGTGTAAAA GTGCAAGAAG GTGTATCCAC 2100 

CCCTTCACAG CTTGOOACCC TTCTOATIAT GAGAGAAACC AGQAACABIG TOCAGTGQCT 2160 

GTCQA(3AAGA CATCAGGAGG AGGAAGACCC AAGOAGAACA AQGGGAACRG AACCAACCAG 2220 

aCTTTACaOQ TCTTCTACAT TAAGTCCATT GAGCCACAQA AAGTATCX3AC ATTAGGGAAA 2280 

AGCAAGGTGA TAGTAACGGG AGCAAACTTT ACCCGCX3CAT CGAACATCAC AATGATCCIG 2340 

AAAGOAACCA GTACCTGTGA TAAGQATGTG ATACAQGTTA GCCATGTGCT AAATGACACC 2400 

CACATGAAAT TCTCXCrrCC A'TCAAGGOGG AAA6AAA1GA ACkSAXmCTG tTATCCAfiTTT 2460 

aATOGToaaA actoctcttc tgtgggatcc ttatcctrca ttgctctgcc acattcttcc 2520 

CTTATATTTC CrTOCTACCAC CTGGATCAGT GGTOGMUVA ATATAACCAT GATGQGCAfiA 2 580 

AATTTTGATG TAATTQACAA CTTAATCATT TCACATQAAT TAAAAGGAAA CATAAATGTC 2640 

TCTGAATATT GTGTGGGGAC TIACTG06GG TTTTTftGCCC CCAGTTTAAA GAlGIIC3U\AA 2700 

GTGOaCAGQA ATQTCACTOT aAAGCTQAOA GTACAAiQACA CCTACTTGOA TTQTGGAAOC 2760 

CTGCAGTATC GGGASGACKC CftGATTCAOS GOGTATOGGG TOGAAiraCGA eGTSGACAiCA 2820 

GAACTOOAACS TGAAAATTCA AAAAQAAAAT QACAACTTCA ATATTTOCAA AAAAOACATT 2880 

GAAATEACTC TCTTCCATGG GGAAAATGGG CAATTAAATT GCAGTriTGA AAATATTACT 2940 

AGAAATCSU^ ATCTTACCAC CATCCTTTGC AAAATTAAAiG GCATCAAGAC TGCAAGCACC 3000 

ATTGCCAACT CTTCXAAGAA A6TTGQGGTC AAGCTOGGAA AGCTlGGnSCT ClAGGTCSAfi 3060 

CAflGAQTGAQ TTOCTTCCAC ATQGTATTTT CTOATTOTSC T C OCmTClV' GCTftGTGATT 3120 

GTCATTTTTG CX3GC0GTOO6 GGTGACCAGG CACAAATOGA AG6AGCTGAG TCX?CAAACAG 3180 

AGTCAACAAC TAGAATTGCT GGAAAGCGAG CTCCGGAAAfi AQATAaSTOA COaCTTTGCT 3240 

GAGCTGCAGA TGGATAAATT GGATGTG^TT GATAOTTTTG GAACTGTTCC CTTOCITGAC 3300 

TACAAACATT TTGCTClfiAe AAClTTCTTC CdGSMStCRG OXGOCTTICS^ CCACATCTTC 3360 

ACTGAASATA TQCATAACAG AGACGGC&AC GACAAOftATG AAAGTCTCAC AGCTTTGGAT 3420 

GCOCTAATCT GTAATAAAAG CTTTCTTGTT ACTGTCATCC ACAOCXTTEGA AAAGCA6AAG 3480 

AACTTTTCXG TGAAGGACAG GTGTCTGTTT GCCTCCTTOC TAAIXMJTQC ACTQCAAACC 3540 

AAGCTOCTCT AGCTGACCAG CATCCTABAG GTGCTGACCA GGGACTTGAT GGAACAGTGT 3600 

AGOTAACAIXSC: AGCCGAAACT CArr6ClX%GA GGCSUGGGAlGT COSTOOTCQA AAAACTCCTC 3660 

ACAAACTOGA TGTCOGTCTG OCT T T C T G GA TTTCTC08GG AGACTGTCGG AGAGOCCTTC 3720 

TATTTOCTOa TGAQBACTCT GAACCAGAAA A^TAACAAGO GTCCCCTTBGA TGTAATCACT 3780 

TGCAAAGCCC TGTACACACT TAATGAAGAC TGOCTGTTGT GGCAGGTTCC GGAATTCA6T 3840 

J J ACrarGGCAT TAAACQTCjer CTTTGAAAAA ATCOGGGAAA ACOAOASTGC ASATOTCTOT 3900 

CGGAATATTT CAGTCAATCT TCTOGACTGT eACAOCATTG GCCAA6CGAA AGAAAAlGKIT 3960 

TTGCAAGCAT TCTTAAGCRA AAATQGCTCT CCTTATGGAC TTCAfiCTTAA TGAAATTQCTT 4020 

CTTGAGCTTC AAATGGGCAC AOGACAQAAA GAACTTCTGG ACATOGACAG TTCX3CCGT6 40 SO 

ATTGTTGAAG ATGOAATCAC CAAGCTAAAC ACCATTGGCC ACTATGAtaAT ATCAAATGQA 4140 

TCCACTATAA AAGTCTTTAA GAAiGATAGCA AATTTTACTT GAGATGTGGA GTACICGGAT 4200 

GAOCACTGCX: ATlTSA^rrrr ACX!AGA*ETCG GAAGGATECC AAGSATOIGCA ABGAAAOAaA 4260 

CATCGAGGGA AGCACAAGTT CAAAGTAAAA GAAATGTATC TG»C3»AGCI GCIGXCSACC 4320 

AAQGTGGCAA TTCATTCTGT GCTTGAAAAA CTTTTZAjGAA GamTOOAO TTTACCCAAC 4380 

AGC3W3AQCTC CATTTGCTAT AAAATACTTT TTTGACTTTT TGGAOGOCCA GGCTGAAAAC 4440 

AAAAAAATCA CROATCCTOA CGTCGTACAT ATTTOQAAAA CAAACN3CCT TCCTCTTOGC 4500 

TTCTOGGTAA ACATCCTGAA GAAOOCTCAG TTTGTCTTTG ACATTAAGAA GACAOCACA^ 4560 

ATAGACGGCr GTTTQTCAOT GATTQCXlCSiS GCSLTTCATTO KTOdO^TTC TCTCACTVQAO 4620 

C3G<AACrA6 G6AAGGAAGC AOCAACTAAT AAGCTTCTCT ATGOC2UU3GA TATOCCAAOC 4680 

TMAAAGAAG AAGXAAAATC TTATTACAAA GCAATCftfiOG ATTTQCCTCC ATTQTCATOC 4740 

TCRGAAATGG AAGAATTTTT AACTCAGGAA TCTAAGAAAC ATGAAAATGA ATTTAATGAA 4800 

GAAjSTGGCCT XGACAG2\AAT TTACAAATAC ATGGXA2\AAT AXTTTSaKF^ Q&TTCXAAAT 4860 

AAACTTAQAAA GASAAOGAiGG GCTGGAAGAA GCTIIZAGAAAC AACTCTTGCA TSTAftAAGTC 4920 

TTATTTGATG AAAAGAAGAA ATGCAAGTOG ATTGTAAGCAC TCTGGGGCCT GGCTTAAtdT 4980 

GSCAA AarTC TTCRGACBAC TTt3GGAG{:!AA AATOaiZTGCT TGAOCXACTC TG7GTCX3TTA 5040 

ATTTGTTGTT TGCACKTAOG TTCCACTTTG GGCACTGTCT TTTXAAGAGA CCAAGGCS^ 5100

TGCACASCTT TTTAGAAAiQCA A 5121
Seq ID NO: C128 DNA Sequence
Nucleic Acid Accession #: NM_002185.1
Coding sequence: 23..1402

1 11 21 31 41 51

| | | | | |

CICTCICTCT ar CICT CTCA GAATGACftAT TCTACGTACR ACTTTTGQCA TGGTTTTTTC €0 

TTTACITGAA 6T0GTTTCIG GAGAAAGTGG CTATGCICAA AATGGAGACT TOQAAGATGC 120 

ASAACTGOAT GACTACTCAT TCTCATGCTA TAGOCAGTTG GAAGIGAAT6 GaTGSaUQCA 180 

TTCACTGACC TGTGCITTTG AGGACCCAGA XGTCRACACC ACCAATCIQQ AKTrOTGRRAT 340 

A TOTOg aGCC CTCGTGGiAGa TAAAGTGCCT OAATTTCAGG AAACTACAAS AGATATOrTT 300 

CaVTOSAGKCA AAGAAATTCT TACTGATTGG AAAGAGCZAAT ATATGTGTQA AGOTTGGAGA 360 

AAAGAaTCTA ACCTGCAAAA AAATAGACCT AAOCACIATA GOTtAAACCIG AGGCTCCTTT 420 

TGA OCffG AGT GTCATCTATC GGOAAQQAGC CAATOACTrT GTG9TGACAT TTAATACATC 4B0 

ACACTTQ(aA AAGAAGTATG TAAAAGTTTT AATGCATGAl! GIMCTTACC iSCCAGGAAAA 54 0 

GGA!EGAAAAC AAAXGGACBC ATOTQAATTT ATCCAGGACA AAQCTGACAC TCCTGCaGAG 6D0 

AAAGCTOCAA CC33GCAGCAA TGTATGAGAT TAAAGTTCGA rtCCAlCOCaXS ATGACIATTT 660 

TAAftOGCTTC TGCS^fiTGAAT GGAOTCCAAG TrATTACTrC AGAACTOCM AGATCAATAA 720

TAflCrcAI36G 6AGATGGATC CTATCTTACT AACSCATOIGC ATTTTGAGTr TTrrCTCTGT 700 

GGCTCTOTTG QTCATCTTG6 CCTGrGTSTT ATGGAAAAAA AI33ATTAAGC CTATCGTATQ 840 

GOCCAGICTC OCGGATCATA AGAAGACTCT GGAACATCTT' TGTAAGAAAC CAAGAAAAAA 90O 

TTTAAATGIG AGTTTCAATC CTGAA2UGTTT CCTGGAClGC CAGATTCATA GOGTGGATGA 960

CATXCAAGCr AOAGATGAAG TGGAAiGQTTT TCTGCAAGAT AOGTTTCCTC AGCAACTAGA 1020

AOAAICTGAG AAGCRSAGGC TTGGAGGGOA TGTGCAGAGC COCAACXGCC CATCTGAGQtA 10 BO 

TGTAGTCGTC ACTCCAGAAA QCrTTTGOAAG AGATICATCC CTCACAT6CC TGGCTGGGAA 1140 

TGTFCASTGCA TGTQAOGCCC CTATTCTCXC CTClTCCAlGG TCGCXAQACT GCAGGGaOAia 1200 

TGGCAAGAAT OOGCCXCATG O^STACCMSGA CCTCCTOCTT AGCCTTGGGA CCACAAACSVO 1260 
cAGscrracxx: octccatttt ctctocaatc tggaatcctg acattgaacc cAOTTGcrcsv 1320

GGGTCAGCOC ATTCTTACTT CCCTGGQATC AAATCAAGAA GAAfiCATATG TCACCATGTC 1380 

CAGCTTCTAC CAAAACCAGT GAAGTGTAAG AAACCCAGAC TGAACTTACC OTOACSCGACA 1440 

AAGATQATTT AAAAC3GGAAG TCTAGAGTTC CTAOTCTCCC TC3\Cfti3CACA GftGAAGACRA 1500 

AATTAGCAAA ACCOCACTAC ACAGTCTGCA AGATTCTG2UV ACATTGCTTT GAOCACTCTT 1560

CCTGAGTTCA GTGGCACTCA ACATOAQTCA AGAQCATCCT OCTTCTACCA TGTGGATTTG 1620 

GTCACRAGOT TTAAGGTGAC CCftATGATTC AGCTATTT 1658 

Seq ID NO: C129 DNA Sequence
Nucleic Acid Accession #: NM_002722.1
Coding sequence: 15..302

1 11 21 31 41 51

| | | | | |

ACrCTQGAGT CXIGGATGaCT OCCIGCAOGCC TXTTGCCTTCTC CClGCTGCTC CrGTOCACCT 60

G0&TG6CTCT CTTACTACAG CCAmTSCTGG OTOOCCAGGG AiQCXCCACTO QAQCCAGTOT ISO 

ACCCAGOGGA CAATGCCACA CCAGA6CAGA TGGCCXIAGTA TGCAGCTGAT CXCGGTAlGAT IBO 

ACATCAACAT GCTGACCAGG CCTAG(?TATG GGAAAAGACA CAAAOAGOAC; AOSCTOQOCT 240 

TCrCQOAtraa QOGGTCXXICG CATCGCTGCTG TCCCCAGQGA GCTCAGOOCG CTGGACTTAT 300 

AATQCCACCT TCTGTCTCCT AOGACTCCRT GAGCAGCGCC AQCOCftOCTC TCOCCTCTGC 360

ACXXTTGQCr CTQGGCZAAAQ CZTGCTCCCT GCTCCCACAC AGGCTO^TA AAGCAASTCA 420 

A2USCC 425 

Seq ID NO: C130 DNA Sequence
Nucleic Acid Accession #: NM_032545.1
Coding sequence: 47..718

1 11 21 31 41 51

| | | | | |

AAACIGATCT TCAAOXSCACr AAGAGAAGGA GACTCTCAAA CCAAAAATGA CCTGGAGGCA 60 

CCATGTCAGG CTTCTOTTTA OOOTCaGTTT GOCATTACAO ATCATCAATT TSGQAAACIUB 120 

CTKTQUUUSA GAK^AftACATA ACGGOGGTAG AiGAOGAAGTC ACCAAG6TTG CGACTCAGAA XBO 

GCACOGACA6 TCACX3GCTCA ACTGGACCTC C3W3TCATTTC GGAGAOOTOBl CTOGOAaCSaC 240 

CXaAOaQCTOQ GGOCOGGAGG W30CGCTCCC CTACTCXXJSG GCTTTOQGAG AGGGTGOGTC 300 

C60GCGGCC6 OGCTGCTGCA GGAAOSGCXSQ TACXrXGCGTQ CrraOOCAGCT TCTQCGTQTB 360 

CSOQGQQCCftC ^rrCACCGGOC GCTACTGOGA GCATGAOCAG AGG0GCAGT6 AATGCGGOSC 420 

CCTGGAGCAC OGAGCCTGGA CX:CTCC3GOl3C CT(3CCACC3C TGCAGGTGCA TCTTCGGGGC 480 

CCTGtyiCJTGC CTCCCCCTCC AGACXK:CIGA COGCTGTGBC OOGAAAGACT TCXrrGGCCrC 540 

CCAOQCTCAC GGGCOGAGCG CCQOGGOOQC GCCCAQCCTG CTACTCTTGC TQCCCTQCGC 600 

ACTCCIGCAC OGCCTCCTGC GCCOGGATGC GCCGG06GAC OdOGGTOGC TOGTOCCTTC 660 

OGTOCTCCA0 OGGGAGCSOC GCCGCTQCQS AAQQCCSSaOA CTTGGGCASC GCCTTTAKIT 730 

VTClSUItSTTG TAAATAAIZAG ATGTGmAS TTTACOSTAA GCTGAAGCAC IGQGfSGKKCh 780 

TTFTTATTGG GTAATAAATA TTTTCATQAA AGGQCCSUAA AAAJffiAAAAA AAAAAAAAAA 640 

AAAAAA 846 

Seq ID NO: C131 DNA Sequence
Nucleic Acid Accession #: NM_006533.1
Coding sequence: 72..467

1 11 21 31 41 51

| | | | | |

AOGGAGAGAG GGAGGGGAGG AAATTGGAGA COOCftGCACC CCCTTGCTCA CTCTCTTGCT 60 

CACAQTCCRC (3ATQQCCOBQ TCXXTTGdCGl* GCCTTGGTGT CATCATCTTG CTGTCTGCCT 120 

TCTCOQGACC TGGTGTCAGG GGTGGTCCXA TGCOCAAGCT GGCTCJACCGG AAgCTqT O TP IBO 

CGGACCAGGA GTGCAGGCAC CCTATCTOCA TOGCIGTGGC CCITCftGGAC TACAIGGCCX: 240 

COQACTGOOO ATTCCTQACC AXTCACCQGO GCCAAGTOGT OEATQTCITC TGCAnQCTGA 300 

AOSaCXJSTGG GOQ8CTCTTC TOGGGASOCA GCX3ITC2IGG6 AGATTACXAT GGAGATCTOQ 360 

CTGCTOGCCT GGGCTATTTC CCCAQfTAOCA TTGTCC33AIGA OGACCAdAQC CIOAAACCTG 420 

GCAAAGTGGA TGTGAAGACA GACATlATOOG ATTTCTACT6 CCAGTGAGGT CAGOCTACOG 4S0 

CTGGCOCTOC CGTTTCCCX^T CCTTGCSOTIT ATGCAAATAC AATCftJSCX^CA 6TQCAAAC 53 B 

Seq ID NO: C132 DNA Sequence
Nucleic Acid Accession #: AB064272
Coding sequence: 1..708

1 11 21 31 41 51

| | | | | |

ATGACACAA6 TCACAGAAAA GTCX»CAGAA CACCCAGAAA AGAOCACGTC AAGCACAGAG 60 

AAAACCSU:^ GAACCCCAGA AAAGCCTACaS CZATACICAG AGAAGaCC*AT ATGC!ACC2VAA 120 

GGQAAAAACA CACCAGTOCC AGAAAAGCCT ACAGAAAACC TGGG6AACAC CACACTGACC 180 

ACTSAGACCA TAAAAOCCCC AOTAAAOTCC ACAGAAAACC CAQAAAAAAC AGC1U3C9U9ZC 240 

AOVAAQACXA TAAAACCTTC AGTCAAOSTC ACAGG2USACA AATCTCTCAC TACIACCTCT 3 DO 

TCTCATCTAA ATAAAACTGA AGTTACTCAT CftGGTGCOCA CTGGTTCTTT CRCCCTCATT 360 

ACATCIAGAA CGAAGCTGAO TTCTATCACA TC3VGAAGCCA CAGGAAACGA GAGOCATCXa. 420 

TAjQCTCAATA AAQATGGCTC ACAGAAAGQT ATOCAOGCTG GACAGAIGOQ AQASAATGAT 4 BO 

TCKTTOCCTS GATGGGCCAT IVOTTATTGro 6TOCDG6XGG CTG3GATTCT OCTCCTGGTB 540 

TTCCTTGOCC TOATCTTCTT GGTCICCTAT ATGATGOGQA CACGQQCSCAC ACTAACOCA6 600 

AACACCC»6T ACAATGAXGC AGAGOATGAG GGTGGCCOCA ATTCCXACCC GGTCTACCIG 660 

ATGGnSChaC AGAATCTTGO CATGGGCCAB ATCCCZTTOCC QuaaQIOA 708 

Seq ID NO: C133 DNA Sequence
Nucleic Acid Accession #: NM_080870.1
Coding sequence: 3..710



1 
I 

AGATGflCACA. 
AOAAAACCAC 
AA6GGAAAAA 
CSCftCTQAGAC 
TCACAAAGAC 
CTTCTCATCT 
TTACATCTAG 
CATACCTCAA 
ATTCRTTCCC 
TGTTOCTTOC3 
AfiAACACCCA. 
TGATGGAGCA. 
GGCBCCX3U3C 
AATTTCTATCr 
CTTCaTCTGT 
AGGGGACAflA 
CTGAtSAATQA 
TGftGT CCTAA 
GGmATGGG 
TTATTTCCAT 
AGOACCCTTG 
AAGAA0AAAA 
CChGAAGAAA 
TnUSTTTTCC 
GGAC&OGCAA 
GACCTTCCCT 
OTAGCTTTAT 
OCrXGACCCC 
ATTTTCTCTC 
' AQGA(3AATGA 
AT6TT 



I 

AGTCACAGAA 
AAGAACCCCA 
CACAOCA6TC 
CATAAA7USCC 
TATAAAACCT 
AAATAAAACT 
AAC6AAGCTG 
TAAAGATQGC 
TGCATGGGCC 
CCTfiATCTTC 
GTACAATGAT 

gcagaatctt 

CCTQGCTCTT 
TGCX3GGACTA 
TCTTQIRAACT 
GAAGTJ^AGAA 
AAA^iGTGTTT 
ASGAAGGACA 
GAAAaOGAQa 
TCACTATTAC 

QAGAATTTAT 
TCAtAAftlAT 
TAUhTQTCTA 
GGA<3SGGATT 
OAlrrOOTQTC 
CTCGTAAAAG 
AGATTCAAGT 
TCTCTTCTXA 
TQATGATAOT 



21 

I 

AAGTCCACAG 
OAAAAGCCTA 
CCAGAAAAGC 
CCA6TAAA6T 
TCAGTCAAGG 
OAAGTTACTC 
AGTTCTATCA 
TCACnSAAAG 
ATAGTTATT6 
TTOaTCTCCT 
GCAGAGGATG 
GGCAIGGGCC 
CCATGCTCTG 
CAG6AASGGC 
QGTTGGGGAA 
TGAATAATAC 
GATOGACAT6 
GGavGCCTtAT 
GACTC3AGGGC 
TEAAGAGXTT 
CATTTTTTTA 
TTOCTTCTCC 
CTCTCATCXA 
CAltTFQOACGC 
TTTATTTSGC 
TCAGCATTTA 
TTAGCCATCr 
TTTCCTCCTT 
TOGOCATITC 



1 

AACACCCAGA 
OGCTATACTC 
CTACAQAAAA 
CCACASAAAA 
TCACAGGAGA 
ATCAGGTGCC 
CATCA6AAGC 
QTATCXACOC 
TGGTCCTGGT 
ATATGATOCQ 
AGGGTGGCCC 
AGATCCCTTC 

AfiAGAATACT 
TGAOGTGATA 
GAGCAGACAT 
TTGTGGGGGC 
AESGCAATGCC 
AGAGTCTCTG 
GTGTGTAAAC 
ATQAAAAAAA 
ACICTCTOCA 
CATOOTTQCT 
CCTGTIGGIT 
CAGCRGTCTC 
TTTTCCTOTC 
TCTCIACTGT 
QTAOQCATTT 
ACCTTATfAC 
CTATTGAOCT 



41 
I 

AAAdQACCACG 
AGAOAAOACC 
CCTGQGGAAC 
CCCABAAAAA 
CAAATCTCTG 
CACrG<3TTCT 
CACAGGAAAC 
TGGACn^TG 
GGCHGTCaTT 
OACAOGCOGC 
CAATTCCTAC 
GGCAGGGT6A 
GGATGAGGAA 
GACQCTTTACC 
AGCAAGGAGG 
TCrCTQTAGA 
ACCAA1X3C3UG 
CCAQACTGAC 
GGTITCAGQA 
AOaCTTCATCT 
AAAACAAAAA 
TGCCCTaOAG 
TOCTCTTCCr 
TGGCTTGCTG 
ACCCACTGAT 
TCTTCCACCA 
CCCCATTCTC 
GATCTGTGrG 



51 
i 

TCAACCACAG 
ATATGCACCA 
ACCACACTGA 
ACAGCAaCAG 
ACTACTACCT 
TTCAOCCTCy^ 
GAQAGCCATC 
GGAGAGAATG 
CTCCTCCTGQ 
ACACTAACCC 

caaoTczTACc 

TCTT06AGTA 
CCGQACTCAC 
AGTATTAAOC 
GTSTAAGTTT 
AGGTAATOGT 
AACACIGC3U: 
TTGTGAGTG6 
CAGCATTATG 
CTGAGTTCTC 
AAACGGATCC 
AAAAAAAAOT* 
COCAAATCCC 
QGTTC3TGaQT 
CTOCACCCCA 
AAAGCCAfiCT 
TCTCCTCCCA 
TGTTTXCXGG 



TTTTTAIAAT AAAGXATAAC 



Seq ID NO: C134 DNA Sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..10674



1 
1 

ATGTGGOCTC 
TTTCAGCAlGA 
GGGGCCCCCa 
GTGOAGOGGC 
06CCTGGA<3C 
ASCORGCTCA 
OGOGTGGOCA 
TCCACC5CGCC 
TCC1ACOGA6 
CTXCArCGCTA 
GGGGQAGACC 
TTTGGCATAT 
SAaCACTQTT 
CATGAAGATC 
TGTGATGAAG 

ACAGCTTGCC 
ATTCCATGTC 
GTCTOC^AGAS 
CTGAAGCCTC 
GCCTGTCSGGO 
CTACCX3UVTO 
CT0CX30CAGC 
ACATOTTTOQ 
CAAiGGAAACA 
TXTCAGA-fGC 
TTTGGQAOGA 
ATGCTQA^SAT 
GAOGTGGAGG 
CAAQATTCTG 
OTGTCAGTOC 
GCIATCGTAT 
AAG6TTATT8 
GTCrCGXSAGA 
aCIGAATTOa 
ATAGTACAGT 
GTCnTAAAIUS 
ACTCCAGATA 
GAAOGGTCIA 
ACC3^CTGAAT 
GAGATGITCr 
GCAT!m3AGIA 



11 
I 

GCCTGGOCTT 
TGTCCCOGTC 
GGASTATOCC 
TOGGOCAGGC 
TTGTCTTCCT 
TGTTOGTCOG 
tCCGTGACCJTT 
G0GC5GOSCCA 
GTGGGGGCAC 
GAOAAAACTC 
CXAGACCAAT 
GGCAA£U3aAA 
ACCTGCTACA 
TAOCTTCTGG 
6CAAGGAGT6 
AGTGCATCTG 
CATOGGGGAC 
CTGATTGAAAA 
AGOGATACS^G 
COGAAAATOS 
ICCOATGTCA 
GTTTGTG£3TC 
CGAAACA^IGG 
TTQCCTOTGA 
GCCAGTGG6A 
CCAAAjGATOr 
TCTGCTATGT 
GTACCACTTC 
CTCCTCAAAT 
OCAATGTTAC 
AC3GTTCRTCC 
ACACOGCAAC 
ATGCAGAACX: 
AGGTACAXQC 
TCATTACCA6 
AXACAGCCAC 

ATACTGOAGT 
CTGACAAGTA 
GGOCAGACTG 
ACAAAGCAGC 
GQACCCTGGQ 



21 
1 

TTCJiTGCTGG 
GCX3CAATTTC 
OGCX3CCGCOC 
GTXCOGGOGA 
OGTGGATGAT 
CAAGCTGCXe 
CTCGTCXAAG 
GCACAAGTGC 
CTACACCAAO 
AACAAAA6TT 
TGCAGCGTCA 
CTTT03AGAG 
CAGTniGAA 




GSOQT^OGQC 
TGGTCCAGOG 
TCOSACTTCC 
AACTACGTG6 
GOGCTGCICC 
GGOGOCTTCC 
GTATTTCTGA 
CTGGGAGATT 
CTGAATGACA 
aAATTTGZU3G 



CTGXGA0C3GA 
TGAAAAGGGO 
ATACAAACCT 
TCACACCTCT 
GGCATCTGGC 
TTACTTTATC 
OCCTGGATTT 
CGGTTCM3AG 

7GAAGGGTAC 
70GGCCAQAA 
CA TCATA TOC 
AAtJTl'GCCGC 
TGGAAAATOG 
CAACTGTCCT 
CTGGCAGATT 
AGCTTTCACC 
TGAOCTATC3C 
ACCTGTCATA 
OGCAAGCTGG 
AZifiTCATACA 
TGACCX32TCA 
TGAAATTCCA 

caactgtaca 
ttattgtgct 
tgccaaaaaa 
tgqttgtgat 
aaaaai:ggtc 



TATTAGGGGA 
GAAOGCrCAC 
tXaCCTGGAA 
CAGACCTGT6 
CAAAACACTTT 
GATCTTGTGG 

TGTTCTACAA 
AGACXAGAAQ 
COCCQGTGTO 
GOOCACAACrr 
CAAGGOTTCA 
AATOTCGGAG 
AASOACATA6 
OCSlAObGCTA 
OCRCCTXACC 
OGCAACCAGG 
GACTGGTGCA 
GATGAGCCTC 
CAAGGAGAOC 
6GCAATAACA 
TTCACACCTG 
TTAACTTGCrr 
TATGAAiSATG 
OGTTTIGCAA 
GACACAGATC 
CCATCnTfTT 



^DGGGGGAAGT 
CXSJPSOTOOC 
TGC06CG0GX 
TGC3kAGA<3AT 
AOCAAGCX38C 
TCACTGATOG 
CAGQAGTGGA 
TGGCirrCCAC 
CTTTftGCTCX3 
TGGTCCACXG 

AAGGTCireCA 
CAGGAGGAAT 
GCACaVTCCCC 
AACXTGTCCA 
GCAACAACCA 
GAAGCAGCAS? 
OAGTAAOAAC 
GGGAAATGTI 
GCAGTIISUXAA 
TGGASOGGCA 
GTV3GCAAGCA 
TTTTATCTQG 
TTCAOGCAGC 
AGGCTAAaAC 
AAGACAACTC 
TTXTCCCAAT 
OCAGCTGCAT 
GATCTOC3UCC 
AOTT CTCflGA 
TTTTCtZCTCA 
GGACATGTGA 
TAAAtGGGGA 
TGGAGGGCTA 
GOGTCTGGAA 
ACaVGGGQTT 
7GAT6AAGAA 
GTAGrGATGC 



51 
1 

CTGOGCGACC 
GAOCGCGCX:C 
GGGOAOCAGA 
GCTCAGCGAG 
CSUU^XTCGGC 
CAOGGCCAOG 
CGATTACATC 
<XXTGCCATC 
GCAAATTCn 
ATATTCCAAT 
GATCTTCaCT 
CCCAAAGGAC? 
OOGGGCATTG 

6AiQU!AiCACA 
GTATGAATGC 
CAGCAGTTGC 
TGAAGACTGT 

CTTCAATOCA 
C3VTCTTATGT 
ATGTCCTCAT 
AXATAAGACA 
(XSnCACTTGT 

crcxvcchcc 

G12CA3CCAAA 
ASTCAAAGAA 
IGTGTGTAAA 



TGGTGAAAAG 
TGGAGATQTT 
TTTCCATATC 
TCCCGTCCAQ 



AGGGGAGACT 
TATCCATATT 
TITTATATGC 
TQATTTCSiCA 
ACCAACATAT 
CAAOTCCTTT 
GTTTTCTGAA 
AGA)3SACA!rr 



60 

120 

180 

240 

300 

360 

420 

460 

54 G 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1360 

1440 

1500 

1560 

1620 

ifiao 

1740 
IBOO 
1BD5 



60 

12 D 

180 

240 

3O0 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

loao 

1140 
120D 
1260 
1320 
1380 
1440 
15D0 
1560 
1620 
1680 
1740 

leoo 

1B60 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
24O0 
2460 
2520 
GACTGCftGAC TGGAOOACSAA CClGACCriAA AAATATTGCX: TAGAATAUA TTATGTU^TAT 2580

BftftAATOacr TTGCAATTGG ACCAGGTGGC TGGGGTeCflG CTAATAGGCT GQATTACTCT 2G40 

TA05ATGACT TCCTGGRCAC TGTGCAACAA ACAGCCACAA QCATCGGCAA TGCXaAGTCC 2700 

TCRCQQATTA AAACSAAGTGC OCCATTATCT QACTATAAAA TTAAGTTAAT TTTTAACATC 27fi0 

5 ACAGCTA6T6 TGCCATTACC CaSATGAAAGA AATOATACCC TTGAAT60SA AAATCAGCAA 3020 

CGACrCCTTC AQACATTOGA AACTATCACA AATAAACTGA AAAOGACTCT CAACAAAGAC 28B0 

CCCATC5TATT CCTTTCAGCT TGCATCA6AA ATACTTATAG COGACAiSCAA TTCATTAGAA 2940 

ACAAAAAAGG CTTOCCCCTT CTGCAGACCA GGCTCftGTGC TGAGAGGGCX3 TATGTGTGTC 3000 

AATTGCCCTT TGSGfiACCTA TTATAATCTG OAACATTTCA CGTrGTOAARG CTGCCQGATC 30€0 

10 GGATCCTATC AAGATGAAGA AJ3GGCAACTT GAGTGCAAGC TTTGOCCCTC TGGGATGTAC 3120 

AjCSGGAATATA TCCATTCAAG AAACATCTCT GATTGTAAAG CTCAOTGTAA ACAAGGCACC 3180 

TACTCATACA CTGOACTTOA GACTTGTGAA TCGTGTCCAC TGOGCACTTA TCAGCCRAAA 3240 

TTTGOTTCXX: GGAGCTGCCT CTCGTGTCCA OAAAACACCT CAACTOTOAA ARGAGGftOCC 33 OO 

GTGAACATTT CTQCATOTOG AlGTTCCTTGT CCAGAAGGAA AATTCTOGOG TTCTGGGTTA 33 60 

IS AXGCCCTGTC ACCCATGTOC TOGTGACTAT lACCAACCTA ATGCAOGCSAA GGCCTTCTGC 3420 

CTGGOCTTGTC CX:TTTTATGG AACXACCCCA TTCX3CrGGTr CCAGATCCAT CACAGAATQT 3480 

TCAACTTCAG rrCTGAATAT TACTATTTTC GGTGOATTTG GGCATCIGGA 6TTGITAAA3: 3540 

TOTCCTTCTG AGGTTTTCCA TGATiTGCTTC TTTAACOCTTT GCCACAATAG TGGRACCTGC 3fi00 

CAGCAACTXG G6CGTQGTTA TGTTTQTCTC TQIOCACTTG GATATACAGG CTTAAAGTGr 3fi60 

20 6AAACAGACA TCSOATQAOTG CAGCGCACTG CCTTGGCTCA ACAATGGAGT TTGTAAAGAC 3720 

CTAGTTGOGG AATTCATTTG TGAGTGCCCa TCAGGTTAIZA CAGQTCAQCG GTOTGA&SAA 3760 

AATATAAATQ AGTOTAGCTC CfifiTCCTTGT TTAAATAAAG GRATCTGTGT TGATGGTGTG 3840 

GCTGGCTKTC 6TTGCACATG TGTGAAA0GA TTTGTAGGCC TGCAXTGTGA AACAGAAGTC 3900 

AA-TOAATGCC AGTCAAACCC ATGCTTAAAT AATGCAGTCT GTGAAGACCA GOTTOGGOGA 3960 

Z5 TTCTTGTGCA AATOC3CC31CX: TGQATTTTTG GGTACCOGAX 6TGGAAAGAA CGT0G&T6A6 4020 

TGTCTCAiGTC AGOCATSCIU^ AA&TQ6IA0CT aCCXOTAAAG AGGCSSGCCAA TAOCTTCAGA 40 BO 

TGCCTGTGTG (T^aCIGGCTT CACOGGATCA CACTGTGAAT T6AACATCAA TGAAT6TCAG 4140 

irCTAATCCAT GTAGAAATCA GGCCACCTOT Ga:GGAXGAAT TAAAITTCATA CAGTTGTAAA 4200 

TQTCAOCCftiG QATTTTCAGG CAAAA6GTGT GAAACAGAAC AGTCTACAQG CTTTAAC3CTG 4260 

30 GATTTTGAAG TTTCTOGCAT CTATGGATAT GTCATGCTAS ATOGCATGCT CCCATCTCTC 4320 

CATGCTCTAA CCTGTACCTT CTG8ATGAAA TQCTCTGAGG ACATGAACTA TGQAACAC3C3^ 4380 

ATCrOCTATG CAGTTGATAA GGGCAfiCGAC AATACCTTGC TCCTGhCTGA TTATAAC3GGC 4440 

TQQCTrTCTTT ATCSltSAATGG CAGGGAAAAS ATAACAAACT GTCCCTC5GGT OAATGATGGC 4500 

AiGATGGCATC ATATTOCAAT CACTTOaaCA AGTGOCRATG GCATCTGGAA AGTCTATATC 4S60 

35 GATaOGAAAT TA!i:CrGAOG6 TOGTGCTGGC CXCTC7GTTG GT7TGCCCAT AQCrOGTGGT 4620 

GGTGOGTTAG TTCTOGGQCA AOA^SCAAGAC AAAAAAGGAG AlSGGftTTCAG COCAGCTGAG 4680 

TCTTTTGTGG GCTCCATAAG CCAGCTCAAC CTCTGGGACT ATGTCCTQTC TCGS^CAOCAG 4740 

STGAAGTCAC TGGCIACCFC CTQCCCRGRlS GAACf CASTA AAGGAAACGT GTTAGCATGG 4600 

CCTGATTTCr TGXCAGGAAT TGTGGGGAAA GTGAAGATOG ATTCTAASAO CATATTTT<?r 4860 

40 TCXGXriGCC CAOGCTTAOQ AGGGTCA(3IG CCTCATCTGA GAACIGGATC IGAAGAITTA 4920 

AAGCXnaGTT CKAAAfiXCftA TCIGTTCIGT GATCCAGGCT TCCAGCTGOT CGGGAAOCCT 49^0 

GTGCAGTftCT GTCTGAATCZA AGGAinAGKCOG ACACAACCAC TTCCTCACT6 TGAACX3CATT 5040 

AQCTGTGOGG TCCCACCTCC TTTGGAGAAT GGCTTCCATT CAGCC3GATGA CrTTCTATGCr 510D 

GGCAGCACAG TAACCTACCA OTQCRACAAT GGCTACTAITC TATTGGGTGA CTCAAGGAT6 5160 

45 TTCTGTACAG ATAATGGSAG CXGGAAlGGGC GTTTCAOCAT CCTGCCTTGA T6TCGATGA6 5220 

TGTGC3W8TTG GJITCftSATTO TAOTGAGCAT GCITCTTGCC VGAAaSTAGA TGOftXCCTAC SSBO 

ATATIGITCAT (3T6TCCCACC 6TACACAGGA GATGOQAAAA ACTOIGCAGA AOCTATAAAA 5340 

TGTAAGGCTC CAGGAAATCC GGAAAATGOC CACTCXTTCAG GIGAGATTTA TACAGXAGGT S400 

GCOGGAGTCA CATTTTC6TG TCAGGRAGGA TACCAGTTGA TGGGAOTAAC (MAATCACA 54G0 

50 TGTTTaOAGT CTGOAGAATG QAATCATCXA. AXAGCAXATT GTAAAGCIGT TTCATGTGGT 5520 

JUVACCGSCIA TTOCWSAAAA TGCTTGCATT <»0GAGTT»0 CATTTZUCTTT TGOCAGCAAA 5500 

GTQACATATA GGTGTAATAA AGGATATACF CTGGCOGGTS ATAAAGAATC AITCCTGTCTT 5640 

GCIAACACTT C^nGGAGTCA TTCCOCTCCT GTGTkjxwAAC CAOTGAAiGTG TnCTAGTCCG 570D 

GAAAATATAA. AOAATOGAAA ATAXA7TITG AGTGGGCTTA OCTACCnTC TACTGCATCA 5760 

55 TATTCATGOG ATAI3A13GATA CAGCTTACSU3 0QCC3CTTCCA TEATXGAATG GAGSGCTTCT 5B2D 

GQCASCIG08 AC2^GAGGGOC AGCIGCGTGT CSUTCTOGTCT TCXGTQQJU3A AOCW3CTGCC 58B0 

ATGAAAGATG CTGTCATTAC GGGGAATAAC TTCACITTCA GGAACAOOST CACTTACSU^ 5940 

TOCAAA^3AAG GCVATACTCrr T6CT66TCTT GACACCATTG AATGOCTGGC CGACGGCAAG 6000 

TGGAGTAGAA GTGACCRGCA GTGCCTOGCT GTClCCTGOrG AIGAGGCAOC GATTG3:G<>AC 6060 

60 CACeCCTCTC CASnSZ^CTGC CCAXGSGCTC TTTGOAOACA TTGCATTCXA CTACTGCTCT 6120 

GA7X9GTTACA QCCTSUSCAfiA CftATTCCCAQ CXTCICirGCA ATGOCCAGG6 CAAGIGOGTA 6180 

OCCCCAGAAG GTCAAGACAT GCOOCOTTGT ATAGCTCXTT TCTGTGAAAA AOCXC3CATCG 6240 

GTITCCTATA GCATCtTGGA ATGTGTGAGC AAAGCAAAAT TTGCAGCTGG CTCAG7TGTG 6300 

AGCTTTAAAT GCATGOAAGa CTTTGTACTG AACMCTCftO CAAASATTGA ATGTATGAGA 6360 

65 GGTGGGCAGT 06AACGCTTC CCCX3VIGTGC ATOCftSTGCA TOCCTOTGOS OXOTGGAISAa 6420 

CCACCAAGCA TCATGAATG9 CTATGCAAG:? GGATCS^AACX ACAGTTTIQG AGGCATGGT6 6480 

GCTZACAGCT GCS^ACAASGG GTTCIACATC AAA0GGOAAA AGAAI3AGC3U: CTGGGAAGCC 6540 

AC7VGGGCAGT GGAGTAGTCC: TATAOOSAOG TGCCACCGGG TATCTTGTGG TGAACCACGT 6600 

AAGGTIGAGA ATGGCTTTCT GGAGCATACA ACTGGCAGOV TCTTTGAGAG XOaAaTGAOG 6660 

70 TATCAGTGTA AOOOOGGCXA TAAGTCAGTC GGAAdGXCCTG TATXTGXCra CGAAGCCAAT 6720 

CGCCACTOGC ACAGTGAATC CCCTCTGATG ^CGTCSTTGCrC TOGACIGIGO AAAAOCTC3CC 6780 

COGATOCA0A ATGGCTTCAT GAAIU3GAGAA AACIXTQAAO TAGGGTCCAA GGTTCAGTTT 6B40 

TTCTGTAArPS AGGGTTATGA GCTTGTTGGT GAC3^TTCTT GGACATGTCA GAAAXCTGGC 6900 

AAATGGAATA AQAAGTC3UU^ TCCAAAGTGC ATGCCTGCCA AC3TGC3CCAGA GGOGCCCXTlC 6960 

75 TTQGAAAACC AGCTAGTATT AAA6GAGTTG ACCACCXSAGG TA0GA6TTGT OACATTTTOC 7020 

TGTAAAGAAf3 GGCATOTGCT GGAaCSSOOCC TCXGXCCTGA AATGCTEGCC ATCCCAGCAA 7080 

l-GGAATGACT CTTTOCCTGT TTGTBABATT GTTCTTTGTA COOCACCTCC C3CTAATTTOC 7140 

TTTGGTQTCe OCATTCCITC TTCTGCTCTT CATTTTGGAA GTACrGTCAA GTATXCIXGT 7200 

GTAGGTGGOT TTTICCTAAS AGGAAATTCT ACCACOCTCT GCCAACCTGA TGQCAOCTOG 7260 

oO AGCXCTOCAC TGCSCA6AATX3 T6TTCCAGTA GAATOTCXXC AACCTGAGGA AATOCCCAAT 7320 

OGAATCATTG ATGTGCAAG6 OCIXGCCTAT CTCAGCACf^ CTCICTATAC ClGCAAlGCCA 7380 

G6CTTTGAAT TGeiaOOAAA TACIACCACC CTTTOTGOAG AAAATGGTCA CTOGCTTCQA 7440 

G6AAAACCAA CKTOTAAAIGC CATTGASTac CTGAAACOCA AGQAQATTTr GAKIGGCAAA 7500 

TTCTCTTACA C3GGAOCTACA CTATGGACAG ACOGTTACCT ACTCTTGCAA CCGAGGCTT^r 7560 
OGGCTCGAAO GTCCO\GlGC CZTGACCT6T TTAGAGAOVG 6TGATTGQQA TGTAf3ATGCC 7620

CCATCTTGCA ATOCCRTCCK CTQTGATTCX: COW^ACCCA TTGAAAATOG TTTTGTAGAA 7680 

GSTGCRGATT ACAGCTATGG TGCGATAATC ATCTACPJSTT GCTTCCCTGG GTTTCAISGTG 7740 

GCTGGTCATG CCRTGCAGAC CTGTGAAGAG TCAGQATGOT CAAGTTCX:RT CCCAACATGT 7800 

5 ATG0CAA.T!Al9 ACTGTGGCCT CCC7CX:TCAT ATAGATTTTG GAGACTGTAC TAAACTCAAA 7BG0 

GArtGACCAOG GATATTTTOA QCAAGAAGAC GACATGATG6 AA6TTOCATA TGTGACTCXn* 7930 

CACCCTCCTT ATCATTTGGG AQCAGTGGCX AAAACCTGGG AAAATACAAA GOAGTCTCCT 79 BO 

GCTACACATT CATCAAACTT TCTGTATGGT ACCRTQGTTT CATACACCTG TAATCCAGGA 8040 

TATGAACTTC TGGGGAACCC TGTGCTGATC TGCCAOOAAQ ATOGAACTTG GAATGGCAGT 8100 

10 QCAOCATCCT GCATTTCAAT TQAATCSTOAC TTGCCTACTG CTCCTGAAAA TGGCTTTTTG 8160 

OSTTTTACAG AGACTAGCAT GGGAASTGCT GTGCAGTATA GCTGTAAACC TGGACACATT B220 

CTAGCAQQCT CTOACTTAAG GCTTTQTCTA GAGAArAOAA AGIGGAGTGG T6CCTOCCCA 8280 

CGCT6TGARS CCATTTCATG CAAAAAGOCA AATCCRGTCA TGAATGGATC CATCAAAGGA 3340 

AQCAACTACA CATACCTOAB C2R.CaTTGTAC TATGAGTGTG AC0CX3GGATA TGTGCTGAAT B40O 

15 GGCACTGAGA GGAGAACATG CCACGATGAC AAAAACTGOO ATGAGGATGA GCCX3VTTTGC 8460 

ATTCCTGTGG ACTGCAGTTC ACCCCCmSrC TCAGCCAATG GCCAGGTGAG AGGAGAOSAfi 8520 

TACACATTGC AAAAAGAGAT TGAATACACT TQCAATGAAG GaTTCITGCT TGAGGGAOCC 8580 

AOOAGTCGGG TTTOTCITGC CAATGGAAGT TGGAGTGGAG CCAC^CCCGA CTGrrGIGOCT 8640 

GTCAGAT6TG CCACCCOGCC ACAACTGGCC AATGGQGTQA C5QGAAGGC5CT GQACTATOTC 8700 

ZO TTCAfGAAOO AAGTAACATT CCACTOTCAC GAGGQCTACA TCITGG^CGG TGCTCCAAAA 8760 

CTCAGCTGTC AGTCftGATGG CAACIG06AT GCAGAGATTC CTCTCTCTAA ACCA G TCAAC 8820 

l^TGGACCTC CTGAAiOATCr TGCCCATGGT TTCCCTAATG GrVSTETTCCTT TATICATGGG 8BB0 

OOCCATATAC AGTATCAGTG CTTTCCTC?9T TATAAGCTCC AlXa^AAATTC ATCAASAAOG 8940 

TGCXZTCTCJCA ATGGCTrCCTG GAGTGQCRGC TCAOCTTCCT GCCTGCCTTG CAGATGTTCC 9000 

25 ACACCAGTAA TTGAATATGQ AACTQTCAAT GaGACAGATT TTQACTGTGG AAAGGCAQGC 9060 

CGGATTCAGI GCTTCAAAGG CTTGAAGCTC CTAGGACTTT CTGAAATCAC CTGTGAAGOC 9120 

GATGGCCAOr GOAGCTCTGG OTTC3GCX2CAC TGTOAACACA CTTCTTGTGG TTCTCTTCCA 51B0 

ATGATACCAA ATGOGTTCAT CASTSAGACC AGCTC^TOGA AQGAAAATQT GATAACTTAC 9240 

AGCrOCAOGT CTQGATATGT CAIACAAGGC AGTTCAGATC TGATTTGTAC AGAGAAAGGG 9300 

30 6TATGGAG0C AGCCTTATCC AGTCTGTQAa CJCCITOTCCr GTGQOTCCCC AC!CaTCTGTC 9^60 

GCCAATGCAG TGGCAAC3X3G AGAGGCACAC ACCTATGAAA GTGAAGIGAA ACTCAGATGT 9420 

CTGGAA0GTT ATAOOATGOA TACAGATACA 6AIACATTCA CCIGTCAQAA AGATGGTCGC 9480 

TQGTTCCCrQ AGAQAATCTC CTGCAGTOCT AAAAAATGTC CTCICCOGGA AAT^TAACA 9S40 

^ CATATACTTG TACATGGGGA CGATTTCAGT GTOAATAGOC AAGTTTCTGT GTC3VT(3TGCA 9600 

35 GAAGGGTATA CSCTTTGAGGG AGTXAACATA TCAGTATGTC AGCII6ATG6 AACCTGGGAG 9660 

CC3bCCftTTCr COGATGAATC TTGCAGTCCA GT^CTTGTS GQAAAOCTGA AAQTCCAfJAA 9720 

CS^TQGATTIG TGGTTGGCAG TAAATACAOC TTTGAAAGCA CAATTATTTA TCAGTGTGAG 9780 

CCTGGCTATG A7«rEAGAGGG GAACRGGQAA COTGTCTGCC AOGAGAACAG ACAlGTOGAGT 9840 

GQAGOOQTtXS CaiAaaUCGCAA AGAGACCAGG TGTGAAACTC CACTTGAATT TCTCAATOSG 9900 

40 AAAGCTGACA TT6AAAACA6 GACGACIGGA CCXIAACSIGQ TATATTCCTG CAACAGAGac 9960 ' 

TACAGTCTIG AAGOQOCATC TGftg GCACAC T6CACAGAAA ATGGAAGCI6 GAGCCACCCA 10020 

GTCCCXCTCT GCAAACCAAA TOC3AT3CCC7r QTTGCTTTTG TGATOOCCGA GAATGCTCTQ 10080 

CTGTCTSAJA ASGAfiTTrxA TGrTGATGAG AATGTGTCCA TCAAATGTAG GGAAGGTTTT 1014.0 

CTGCTGCAQG GCCACQGCAT CATTACCTGC AACCCCGACG AGACGTGGAC ACAGACAAGC 10200 

45 GCCAAAIGTG AAAAAATCTC A1X3IGGTCCA CCAGCTrCACQ iTAGAAAAOtSC AATTGCTGGA 10260 

GGOGTACKTT ATGAATATGG ASAGATGATC ACCTACTCAT G^nACAGilGG ATACATGTT9 X032Q 

(3ZU30GTTTCC TQAGSAOTOT TTGrTTAGAA AATGGAACAT GGACATCAOC TCXTTAmTGC 10380 

AGAGCTGTCr GTOGATTTCC ATOTCAGAAT GGQGGCATCT GCCAACOCCC AAATGCTTOT 10440 

„ TCCTGTCCAG AGGGCtGGAT OGGGCGCCTHC TGTGAftGAAC CAATCEGCAT TCTTCCCTGT lOSOO 

50 CrGAAOCSSM? GTOOCTCTGT GGCC50CTTAC CftGTGTGACT GOCXGCCTOG CTGGAOaGOG 3.0560 

TCIOGCIGIC AT2U!M3CX63r TTGGCAGICT COCIGCTTAA ATGGTGGAAA A3X?IGTAAGA 10620 

CCAAACSOQAT QTCACTQTCT TTCnE^CTTGG AC)QQSA£!ATA ACTQTTQCAG GTAA 10674 






Seq ID NO: C135 DNA Sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..390

1 11 21 31 41 51

| | | | | |

ATGAGGTTCA GTGTCTCA(3G CAlTGAGGACC GACTACCCCA GGASTGTGGT GGCTCGTGCT 60 

TATdlGXdUJ TCXOTCTOCT CCTCTTOTOT GCSUUSGGAAO TCATOQCTCC aaCZaGCTCSL 120 

GAAGCATG6C TGTGCC&GOC GGCACCCAGG TfflGSAGACA AGKTCTACAA CCCCTTOGAG 180 

CAGTGCTGTT ACAATGAOGC CATOGTGTCC CTC3AGGGAGA CCXiSGGCAATG 1GGTOCC0CC 240 

TGCAOCTTCT GGCOCTGCTT TGAGCTCTGC TOTCTTGATT CCTTCTGOCCr CAOyVAOTAT 300 

TETGfCV&eO^ AGCXGAAGGT TCIAGOGTGTG AArTTCCCAGT GCCACTCATC TCCCATCICG 360 

AGTAAATGTG AAAGAGG005 SATATGTXAO 390 



Seq ID NO: C136 DNA Sequence
Nucleic Acid Accession #: BC035671.1
Coding sequence: 126..1745

1 11 21 31 41 51

| | | | | |

GGCAGOQACT GCX3CCCGGTC OCGGOGCGGC GCTOGTCCBC AGAGOAfSaOG GCCCOGCCOG 60

GGCAGCTGG6 GCTOGQGATC O&TCGaOGGG ASGOGGAGCT TGCCAAGCTG GOGGCCAGOG 120 

GGGTCATGOT QICCCX3GCGCC OGCGGOGGOG QGaCACTGGC GOBGGCIGCC GGGCGOSGCC 180 

TCCTGGCTTT GCTGCTCGCG GTCTOOGCCC CGCTOOGGCT GCAGGOGOaG GAGCTGOaTG 240 

ATGGCTGTGQ ACACCTAGTQ ACXTATCAGG ATnGTGGGAC AATGACATCT AAGAATTATC 300 

OCGGGACCTA OCCCAATCAC ACTOTTTGOG AAAAGACAAT TACAGTACCA AAGGGQAAAA 360

GACTGAXTCT GM3GTTGOGA GATTTGGATA tTOGAASTCGCA GAOCTGT S CT TCTGACTA7C 420 

TTCTCTTCAC CAOCXCITCA GATCAATATG GTOCATACTG TGGAASIKTG ACIGTTCSCCA 480 

AAGAACTCtS GTTGAACACA AOTOAAGTAA CGGTCGGCTr TGnSAGTGGA TCCX9VCATTT 540 

CIQGC06G6a TTTTTIXSCIG ACCTATGGGA GCAGGOACCA TCCAGATTTA ATAACATGTT 600 



TGGAACOAGC TAfiCCATTAT TI6AAGACAG AATACAGCAA. ATTCTGCOCA GCT66TTGTA 660 

GAGAOGTAOC AOGAGACATr TCTGGGMOA TG6TAGA,TGG ATATASAGAT ACCTCTTTAT 720 

•TGraCSVAAGC TGCCATCCAT OCAGGAATAA TTGCIGATGA ACTAGGTGGC CAGATCAGTG 780 

TGCTTCAGCO CRAAtSGGATC AGTCGATATG AAGOGATTCT QQCCAATGOT GTTCTTrCOA B40 

GGGATGGirXC CCTGTCAGAC AAGCGATTTC TOTTTACCTC CAATGGTTGC AGO^GATCCT 900 

TOnCTTTTGA ACCTGACGG6 CAAATCAGnS CTTCTTCCTC ATQOC«3TGG exCftATGAOA 960 

GTQGAGACCA AGTTCACrOQ TCTCXTTGGCC AAGCCCSACT TCASSACCAA GGGCCATCAT 1020 

GGGCTTCGGG C6ACAGTAGC AACftACCACA AACCACQAOA aTOGCTCSQAQ ATCt3ATTTG{3 IDBO 

GGOAGAAAAA GAAAATAACA GGAATTAGGA CCACAGGATC TACACAGTOG AACXrCAACT 1140 

TTTATGTTAA OAGTTTTGTG ATGAACXTCA AAAACAATAA TTCTAAQTOQ AAQACCTATA 12 OO 

AAGOAATTOT QAATA2UTGAA GAAAACGTGT TTCAGG6TAA CTCTAACirr COGGACCGAG 12fiO 

TGCAAAACAA TTTCATGOCT CC3CATCGTGG CCAGATATGT GCGGGTTGTC CCCCAGACAT 1320 

GGCACCAGAG QATAQCCTTG MGGTGGAGC TCATTGGTTG OCAGATTACA CPM3GTI\Xm 1380 

ATTCATTOGT GTGGOGCAAG AGAAGl'CAAA GCACCAQTOT TTCAACTAAG AAAGAAGATG 1440 

AGACAATCAC AAQGCCCATC CCCTOGGAAG AAACATOCAC AGGAATAAAC ATTACAACGG 1500 

TQGCTATTCC ATTG6TGCTC CTEGXTIGTCC TGGTerTTGC TOOAATOGGa ATCTTTGCA6 1560 

CCTrCAlGA2U!k GAAlSAAGAAQ AAAGOAAGTC 06TATGGATC AGCAGAGGCT CAGAAAACAG 1620 

ACTOTTGGAA 6CASATTAAA TATCCCTTTG CCAGACATCA GTCAGCTGAQ TTTACCATCA 1680 

GCTATGATAA TGAGAAGGAG ATGAC31CAAA AGTTAGATCT CATCACARGT GATATGGCAS 1740 

GTTAACTCCG TTGACT6GCA AAATAGCATC CCCAACGTGC AGCdTTOCGC ATCTATCAGC 1800 

ASGITGCCOC GaATOQATCT CAaAOATOAG GATCOGAACA CCATGTTCTT OTOCCACCICrA IfltiO 

ACAACAACAA AGSGCAGTAA ATTAAAGIAC TCTTIGTAAQ GTACAGTTAC G^TTAATCT 1920 

AGAGATAAAA TATTTTCTTA AAAATATATT TGATTAAACA OCTATGCIGT CTCTATAAAA 1980 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2010 

Seq ID NO: C137 DNA Sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..1761

1 11 21 31 41 51

| | | | | |

ATOGGATTOG 6TGCX3GG6CA GC3GACIGGGC CCCQTCCCSSG OGCCGOGCTC GTCCXX3U3AG 60 

GAQGCGGCCC GGCCCGGGCA GCTGCGGCTC GGGATCCGTC GAQQGOAGGC CGAGCTTGCC 120 

AAGCTGGCX3C CCAGCGGGGT CATOQTGCCC GGOQCCOGCG GCGGCOGCGC ACTGGOGOGG 180 

GCTGCOGGGC GGOGGCTCCT GGCTTTGCT6 CTOSCGGTCX CCQCCCCQCT CGC3QCTGCA(? 24Q 

GOGGAGGAGC ^tGGBlGKEGQ CL&SQQKCkC cnAGTOACTT ATCAGGAT9G TGGCACAATG 300 

ACATCTAA0A ATTATCOOGG GACCIACOCC AATCACaUCl^G ITTGCGAAAA QACAATTACA 360 

6TAOCAAAGG OGAAAAGACX GATTCIGAGQ TTGGGAGATT TGGATATOGA ATCCCAGACC 420 

TOTOCXTCTO ACTATCTTCT CTTCACCAGC TCTTCAGATC AATATGGTCC ATACTGrOGA 480 

AGTAIGAGIG TTCOCAAAGA ACTCTTQITB AACACAAGT6 AAGTAAOGGT COGGTTTGAS 540 

AGTOOATCCC ACAXTTCTGG OCGGG6TTTT TUGCXGACXn: ATGOBAGCAS C3SACX31TCXA. 600 

GATTTAATAA CATGXTIGGA ACXSACSCTASC CATTATTTGA AGACAGAATA GAGCAAATIC 660 

Ta<X:CAGCrG GTTOrAOAGA OGTAGCAOGA GACATlrTCTQ GGAATATGGT AGATGGATAT 720 

AGAGATACCT CTTTA^tTGrG CAAAGCTQCC ATCCATGCAG GAATAATTGC TGATGAACTA 780 

GGTGGCCAGA TCAGIGTGCC TCAGOGCAAA OGGATCAGTC GATATGAAGG GATTCTGGCC 840 

AATG6XGTTC ^CTTCaAGQQA TQOTTCCSCXQ TCASACAAGC GATTTCXGTX TACCXGCAAT 900 

GGTTQCACiCA OATCCTTGAG TTTTGAAfXlX GAmSGCAAA TCAGAGCTTC TTCCTCATGG 960 

CMSTOGGTGA ATGAGAGTGG AGACCAAGTT CACTOGTCTC CTGGOCAAGC GOGACITCAG 1020 

GACCAAGGCC CATCATOGGC TTCGGGOGAC AGTASCAACA ACCSICAAACC ACQAGAGTGG 10 SO 

CTOGAGATOG AITTGGGGGA GAAAAAGAAA ATAACAGGAA TTAGGAOCAC AGGATCTACA 1140 

CAGG?C3GAACT TCaACTTTXA TSTXAAGAGT TTTSimCGA ACTTCAAAAA CAATAATTCT 1200 

AAGTGGAAGA GCIATAAAGG AATIIG3'aAAT AATGAB0AAA AGSTGTTTCA GGGTAAGTCT 1260 

AACTTTOGGG nCCCASmCR. AAACAATTTC ATCOCTCCCA TCGTGGCCAG ATATGTGOQO 1320 

QTTGTCCCCC AGACA1GGCA OCAGAOGATA GCCTTQAAGa TGGAQCTCAT TOGTT6CCAG 1380 

ATTACaCAAG GTAATGaTTC ATTGGTGTGG OQCRAGACAA GTCAAAGCAC CAGIGTTTCA 1440 

ACCAAGAAAB AAGATGAGAC AATCACAAGG CCCM!OCCCT CXaOAAGAAAC ATCm»GGIA 1500 

ATAAACArCA CATbCaOTGGC TATTOCATTG GTGCT0CTT6 TTGTOCXGSl: GTTTGCTGGA 1560 

ATGGGGATCT TTOCAGCCTT TAGAAAGAAG AA8AAGAAAO BAAGTCaTTA TGGATCAGCA 1620 

GAGGCrCRGA AAACAGACTG TTOGAAGCAO ATTAAATATC CCTTTGOCAG ACATCSWSTCA 1680 

GCXGAGTTTA GCSTCAGCTA TGATAAIGAG AAGGAGAatSA CAi:!AAAAGTT AGATCTCATC 1740

ACAAOTGAXA TGGGAGGTTA A 1761 



Seq ID NO: C138 DNA Sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..2310

1 11 21 31 41 51

| | | | | |

ATGTTCCftOC OGCAGQAAAG ATTTCTTGAC TTATCTTCAO ClGAAGCftOT GaCAfiCITGG 60 

ATATXACATC AACATCCTGA CATTATTAAC AAAGGTGATG GCTGXGGACA CXrrAGTGACI 120 

XATCAGGATA GTOGCACAAT GACATCXAAG AAXXATCCOS GQACCKAOOC CAATCACACT ISO 

GTTT6GSAAA AGACAATTAC AGZACXaUVAlG GGGAAAAGAC TGAITTCIGAG GTIGGGAGAT 240 

rrrGOATATCG aatcccagac ctotccttct gactatctxx: tcttcaccag ctcttcaqat 300 

CAATATGGAA TGCAQAAGGA GGAGGAGACA GAAGTGCTTT GTCTTTCAGT GGCTGGOGCT 360 

CAGAGAGTGG ACATTOCTGT GCAOCTOTTG CXX^AGCIICC TGGAAGGGTG GAAGOGTCAT 420 

GCTGAIGCAA GAGGTCCATA CTGIGGAAST ATGACTGTTC CCAAAGAACT CTTGTTGAAC 4B0 

ACAAfiTGAAG TAACCOTOCQ CTTTGAGAGT G6ATC0CACA TTTCTGGCCSG GGGTTTTITG 540 

CTGACCTATG GGAGCAGOGA CO^TCCAGAT TTAATAACAT GTTTGGAAlOG AGCTAGCCAT 600 

TATTTGAAGA CAGAATACAG CAAATTCTGC CCAGCTGGTT GTAGASACGT AGCAGGAGAC 660 

ATTTCTG6GA ATATOGTAGA TGGAarATAGA GAXACXIICTT TATTOTGCSU^ AGCIGCXZATC 720 

CATGCAGQAA TAATTGCTGA TGAACTAGGT GGCCAiOATCA GTGTGCTTCA GCGGAAAOGQ 780 

ATCAGTGGAT ATGAAGGGAT TCTGGOCAAT GGrGI^fCTlT CaaAaGBATOG T T COCTOtCA 840 

GACAAGCGAT TTCtGTTTAC CTCCAATOGT XQCAGCASAT OCTTGAGTTT TGAACCTGAC 900 

GGQCAAATCA GAGCTTCTTC CTCATGGCAG TCSGlCAATG ASAGTGGAGA CCAAGTTCAC 960 



AGCAACAACC 
ACAGGAATTA 
GTQATGAACT 
GAAGAAAAG6 
CCTCCOVTCQ 
TTGAAGGIG6 
AAGAJCAAGTC 
ATCCCCTCGG 
CA6ATGATCT 
ACAGCOQAAA 
TTCTT2VGCAC 
CXICTTTGCCA 
ACACAAAAGT 
GGCAXrcaOGA 
GAGGCAG6G6 
ChCGav&TACQ 
CGGCAOGTGC 
CABCCCGGCC 
GCCCAGGACG 

CCAJU3GCATC 



GCCAAGGGCG 
ACAAACCACG 
GGACCACnOQ 
TCAAAAACAA 
TGTTTCauSGG 
TGaCCftOATA 
AGCTCATTG6 
AAAGCRCX3^ 
AASAAACATC 
CACAAAGGGA 
GCftTG&rXACS 
ACACACCTGA 
OflCATCAGTC 
TAGATCTCAT 
CAOTCACSGAa 
TGAGCAOOGA 
OGCTGCCXJCT 
TGOGOGCCCA 
ACftAACACTC 
GAGACTATCA 
CTGTCRfiCaC 
COGG<SACGAG 
CCATGACTGC 



acttcaggac 
agagtggctg 
atctacacag 
taattctaag 

TAACTCTAAC 
TGTGOGGGTT 
TTGCCaaATT 
TGTTTCAACT 
CACAGATGCC 
GAATCTG06A 
AGTAGTGnT 
GGKGGACZVTT 
AGCTGAOTTT 
CACAAGTGAT 
GAAGGGCTCC 
TGCOGGOGGC 
GGCGCCCCCG 
CACGTTCTCT 
CCTCTCCTCG 
AAGGCCACAC 
CCTCGCCACC 
TGACAGCPAT 
CCTTTTGTGA 



CAAG6CCCAT 
GAGATOSATT 
TCGAACTTCA 
TGOAAGACCT 

tttggggacc 
otccccx:aga 
acacaaggta 
aagaaagaag 
atgccaqtgc 

CCTGAT6AG6 



GATCACTACT 
ACCATOVGCT 
ArOGCAGATT 
ACCTTCCOGC 
CACTATGACT 
GAGGCCQAaT 
G0GCAGAGC6 
GGCGQCTTCT 
iVGOGCACAGC 
BAAAGGQSGC 
TCTGCXX:CCA 



catgggcttc 

TGGGG6AGAA 
ACTTTTATGr 
ATAAAGGAAT 
CAQTGCAAAA 
CATOGOVCCA 
ATGATTCATT 
ATGAGACAAT 
AGATTGTCGG 
GCAAAATACC 
TTAATGACCT 
GITGG2UVGCA 
ATOATAATQA 
ACCAGCAfiOC 
CCATGGACAC 
GGCOGCMSCe 
ACQCCACGCX: 
GCTACOGOGT 
CCCCCGTAQC 
CIG0QGAOU3 
ACCCTtSACXC 
GAGACXCOCT 



GOGOGACAGT 
AAAGAAAATA 
TaAGAGTTTT 
TGTOAATAAT 
CAAXTTCATC 
OAOGATAaCC 
GGT6TGGOGC 
CACAAGGOCC 
AGACCATACC 
TTTTAAAGGC 



GATIAAATAT 
GAAGGAGATG 
CCTCRTGATT 
GGATGCSCQAG 
GSGOGGOCGC 
CATOGTGOAS 
CCCAGGGCCC 
GOGTGTGQQC 
GGGCIAG6AC 

nxAOAAGccx; 

CACAOOCCTC 



1020
1080
1140
1200
1260
1320
1380
1440
1500
1560
1620
1680
1740
1800
1860
1920

1980

2040
2100
2160
2220
2280
2310



Seq ID NO: C139 DNA Sequence

Nucleic Acid Accession #: NM_004616.2

Coding sequence: 180..893






1 
J 

ASTGCCCXAQ 
ATTTAaATGT 
GACAAGCCTG 
IGGCAGGIGT 
TATOTOGTAT 
CAAXT!IXTGG 
CTOTWJGTGC 
GTOGCIOCAT 
CBACAGOTAT 
TCTATGAAAA 
CCAVAATTGT 

QCCAAAGCTA 
TCTTGGGAAA 
TACTGOGTTT 

AAATATGTAA 
AGACCACA6A 
AATGATATGA 
AAAAAAAAAA 



11 
I 

QAGCTATOAC 
OOGTGGATAC 
TAAC3GAATAG 
GAGXGCCTGT 
CTTGATCCTA 
TTCI6AAGAT 
CATCATCATG 
GCTTCTGXTG 
CCTAGOAOCr 
CACAAAGCTT 
QTTTCAA@AA 
TTTTCftACAC 
TAATGGAAAA 
AAATriGATT 
GGTGTTTICr 
CCXATG3TCA 
CIGCTATATA 
TATCTTCTAlG 
ATGTGTATTT 
AAAAAAAAA 



21 
I 

AAGCAAAGGA 
AGAAATCTCX 
TTAAATTCAC 
ATfAAAATATT 
GCATTAGCAA 
G7AGGCTCTA 
ATTCTGGGCT 
Vlirn'OVTAG 
OTTTTCAAAT 
TTGAGGGCCA 
GAGTTXAAAT 
VATCCItaAA^ 
CAAGTTTACA 
ATAfiTTATTG 
ATGGTOCT6T 
QTCAAAC3CCC 
AGTC!AOGAf3C 



TTACTCAAAA 



31 
I 

ACAIACTTSC 
GCftSGCAAOT 
GGCATCTGGA 
CTATQTTTAC 
TATGGGTA06 
GCTCCTAOST 
TOCTGGCTTG 
GCTK3CTTCT 
CTAAGTCTGA 
QVQGGOAAAQ 
G CTGOB GTTT 
fEATQTOCCTG 
AAGAGACCT6 
GAATATCATT 
ATTGCCAGA!i: 
TTTAMATGT 
AGCTGTCiVr 
CAChlTT&AG 
TAAAA6TAAC 



41 
I 

CTGGAGKTAG 
TGCTCCAGAa 
TTOCTAATCC 
CTTCAACrrTC 
A6TAAGCAAT 
TGCTQTGQAC 
CTGOGGTGCT 
OATCCTQCTC 
TCGCATTGTG 

GGTCAATGGA 
TCTAOATAAG 
TATTTCTTTC 
TGQACTGGCA 
GGGGAACAAA 
TGCTTTS3CT 
TIAAAIOGTC 



SI 

i 



CATATTQCAQ 
TTTTCOSAAA 

GACTCTCaU^ 
ATATXG3WITO 
ATAAAAGAAA 

AATGAAACTC 
TTCCA0QAAG 



ATAAAAGACT 
GTTATTQAGA 
TGAATCTGT6 

TC3GGCXAGCT 



T6TTT3U:GTT AAAAA2UWAA 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080

1140 
1159 






Seq ID NO: C140 DNA Sequence

Nucleic Acid Accession #: NM_004617.2

Coding sequence: 232..640



1 

I 

CTTCAGOTCA 
AGCTGAAGCA 
ATTGAATTGG 
AGGAGGCAGT 
GGGGQCTQXG 
GCTAACATCJC 
GAAGAGATCT 
CTGGTGTTCT 
GGGAAGGG&T 
GQATACXdn* 
AATAGTACAT 
AACAAGTQCC 
CTGQTOGliAO 
GGSAGCCTCI 
AC9CTaC3GAfia 
CTTCTTCTCT 
AGGC3kGTrAT 
TAAAAAATAO 
TXAGAATTTA 
ATGTATAAAG 
TGCSAA6TTTT 
TTCTGTXTTT 
TTTTTTGCTG 
GGGTTCTTTT 



11 
I 

GGGAGAATGT 
ACIXXS^AGGA 
AOACAATTAC 
GASGAGCTTT 
CCAOATGCCT 
TGTTATTTTT 
GGTTTTTOSQ 
TGGGCCTGAA 
TTGCGATGTT 
^ATCATCTC 
GGG6CTACGG 
GA6RG0CTCT 
GAQGAATGCA 



tgagotgctc 

TCTTGGAATT 
TCCTTCTTTC 
TTTGGCCACT 
CCAACAQGIT 
TCACATGTAC 
ACATAAATCA 
AAGAGGAAAA 
CCTTQCTTGA 
ATGXATGTTA 



21 
I 

ATAAATGTOC 
CACAGITCAC 
AAQGACTCFC 
TGATTGCTGA 
GGaOGGaACC 
TOCTGGAGGA 
AC3GAATATTA 
GAACAATGAC 
CavCCIOCACG 
AQCCKTTTCA 
OTTCCAC3GAC 
CAATOTGQTT 
QATGGTTCTC 
CCAGIGTTOr 
AGACTCTACA 
ATTAATTCCT 
CAAOCAGCTT 
TAACAAAXTT 
CAAAGCATAC 
TGCCATAlCTA 
AAGGAAOAAA 
A6AATGATTG 
BTTGCITGTG 
AATTAAAAOC 



31 
I 

ATTGCCATOG 
AGAAATTTGO 
TGQ OCAA AAA 
CClCfCQfSX&T 
CTCATTCCCC 
AAAGTGAtAG 
(SQAAGCGGTG 
TGCTGTGGGT 
AXATTTGCTQ 
ATQkACftAGS 
GGGGATTATC 
CCCTGGAATC 
^CQOGCCATOC 
GGCTQCTQTO 
GCATGACGAC 
ATCTGCTTCC 
TaCTCGAGTT 
GATTTATAAA 

rr mxTGK T 

CTTCITTGTA 
GCftCATTTAA 
ATGrATCCTA 
ACTGATCTTT 
TQAATTCM3A 



41 
1 

AGGTTCTGCT 
TTCTCAGCXJC 
CCCTTGAAQA 
ACCAOCCCAfS 
TTGCTTTTTT 
ATXSACAAGGA 
TCrWATGAT 
GGTGGGGCAA 



GTCCIAAATG 
TCAATtSATOA 
TGACOCTCTT 
AGGTGGTCAA 
OGGGAQAOIGG 
TACAATTTCT 
TABCTGATAA 
AGAATTTTGT 
TCTTTCAAAI 
TTTTTTATTA 
TATAAAGJOG 
AATGAOAAAC 
AGTATTGTTA 
TGAGGCrGTC 
GCTAAOBT 



SI 

I , 

ATTTTTGAGA 
CAAAATACTG 
GGCCCOGTGA 
AATG«3CACI 
TOGCTTCCTG 
GCAOCTTTCC 
CTTOCCTQCJS 
OGAjSGGCTGT 
CTTOGGAGCX 
OCTCATGGOC 
GOCCmATGG 
CTCCATCCTG 
TQQCCTCCX6 
ACCGGTTTAA 
TTTCATAAAA 
AGCTTAOAAA 
TATTTTCAAA 
TAQnTOCTlT 
CAAATC3IAAA 
TTTATATCTT 
TAACACCAAT 

ATCATGGCIA 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1428 



Seq ID NO: C141 DNA Sequence

Nucleic Acid Accession #: NM_002381.2






Coding sequence: 64..1524
1 11 21 31

lilt 
AAATCCDOAGC CTC6C6TGGG CXCCTrGGCCC COQAaSQACA 
ACCAT6CGGC GCCC3QGCCCC OSOGGQCOaC CTCCCGGGAC 
CMCTCCTGC TGCCCTCXX3C CGCCCCCGAC CCCGTQGCCX: 
GAGACCCGAG GTCCXlGGGGG CAGCCCTGGA CGCCGCCCCT 
GCOCCCGCTT CCGGGACCAO CQAGC!CTGGC CQCGCCCGCG 
AGACCCTTGG ACCTGGT61T TATCATTGAT AGXTCTCGTA 
ACCAAAGTGA AAACTTTTQT CTCCC3QQATA ATCBACACTC 
ACX3CQGGTGG CAfiTG(3TGAA CTATGCTAGC ACTGTGAAGA 
TACACAGATA AOCACSTCCCr GAAGCAGGCT QTOGQTCQAA 
ACCATGTCaS GCCTAGCXyVT CXZAGACAGCA ATGGACX?AAG 
GCICQAGAGC CCTCTTCTAA CATGCCTAAG QTQQCCATCA 
CAGGACCAGe IGAATGAAGT GGOGGCTGGG GGCCAAQGAT 
GTGO^CGTGG AGCGGGCAGA CATGGCOTCC CTCSUiOATGA 
GAGCATGTTT TCTACX3T6GA GACCTATGCG GTCATTGRCSA 
QAAACCTTCT aTGaSCTOGA GCCCTOTGTa CTTQGAACAC 
ATCAGrlTGATG GGGAA0GCAA GCACCACIGT GA6IGIAGCC 
GACAA6AAAA CQTOTICAGC TCTTGATAaa TGTC3CTCITA 
ATCIGTGTGA ATGACAGAAG TGGCTCTTAT CATTGTGAGr 
AATOAAOACA aaAAAACTTa TTCAQCTCAA OATAAATGTG 
CAGCACATTT GTGTGAATGA CAGAACAGGG TCCCATCATT 
ACTCTGAA'XG CAOATAAAAA AACATGiaiCA 6TCGGTGACA 
GGTTGOCRGC ACATTTSTGT GACTTGATCGQ QCCQCATCCT 
OGCTAGAOCr TAAATGAGSA CAAGAAAACA TSTTCAGbCA 
GTTTCCACTG AACJATGCTTG TGGATQIBAA GCTACACTG6 
TCGTATCTTC AAAGACTGAA CACTAAACTT GATGACRTTT 
OAAXATOGAC AAATACATCG ITAAA.TTGCT CCAATTTCTC 
TGGTGTACTT AATACTCATG GATTCTTTTG GAGACCTGTT 
TAA^TTGCCA TTATClTGTAT TAATGCTIIGA ATATXACTGS 
CTGCAGAATC AGCATGATTT TTCCAAGGAA ATACATATGC 
CTTXAQTQTC TCTAAOTTAT GACtGTGAAA TGATTOGTAG 
GTGTTTCTTT ATCTACTAAT TGAGCCATTT AATTTTTAAA 
ATTCACAATQ GAAACTTTAG GTCTAGITTC miGAKAGT 
TTATTACTGA GftGTGCAAAT TOTACAAGGT ATTTACACAT 
TOAATGTAAT TTTCAACTQT TTAACACITT TTGTXTTTTG 
TTGAAGATGr GATCAATAGA TTGTAATACA CATATCTAAA 
TOAACATTAC ATTGOCAXTT TEAATTCATT C^TGGTCTTTG 
AGGACTAGTT GTGftAITTAG GGTGTTAAAC TTTCTACCAA 
CITTATXATT TTGCTTO^GG ATCCAAG1?GA CAAAGl:f ATA 
ATOGACAAAA TCTAATGTTG TCTTTITAAT GTTAGTGATC 
AGTGCTGGQA TTACAGQCTT GAAA9TCTAA CTXTTTTTTA 
AATTCTTTTG GCTTTGAAAC TTGCAACITT GAGAACAAAA 
TQCTCAATTC TOTTTTTOGT TTGCATTOTC TTXAATAXAA 
lAITATCaTG TCIATTTTTG ATGACICATC AATTTIGrCT 
TTAAAAAAAA AAAAAAAAA 



Seq ID NO: C142 DNA Sequence

Nucleic Acid Accession #: NM_016639.1

Coding sequence: 40..429




GCCCGGGCTT 
CTCCTGCGGC 
QTGCAQGTGT 
GCGTAOGGOC 
TGGAaVTTQG 
TOGAGTTCCA 
TCACACCCTT 
CCTTCACAGT 
IIGTTACAQA 
CIGGTATT6A 
TGGC3C3\GIGA 
AACTTTCCTC 
ACCAGTGGCA 
AAGGATACAC 
ACACCCACGG 
GCTATGAAGG 
CTTTQGGTAC 
GTGAATGCTA 
AGIGTGGOCT 



CACGGAGCCC 
GCTCTGQCCQ 
G0GGAGGCT6 
TCCC3GACGGC 
TTGCAAGAGC 
GCTC3GAATTC 
GCCAGCGGAC 
ACTCC3W3GCC 
CTCAACAOGC 
OGAGGCAGC^ 
IGSGAGGCOC 
GCTCTATQCr 
GCCCCTAGAG 
TAGATTCCAG 
GCAGGTCTGC 
CTT6AAIGCC 
ATGTGAGC3^ 
TTATAOCTTG 
CCATGGGTGT 
TQAGGGCTAC 
AGGCTCTCAT 



CTGAGGAAGC 
C3VTTCCAC3GA 
TGGAGAAGTT 
ACCTGAAAAT 
ATTOCCAATa 
ATAAATTGTA 
AGATACTTAT 
GAAA7AGAAT 
TCTTTATATT 
AITCATAATA 
ACAACTTCAT 
CTTATTTTGT 
AATAGTTAAC 
AAAGAAATGT 
GTACAAAAAT 
TATITATAAA 
CACCTGCCTC 
CTIAIATATT 
CAGTOCTTTA 
TAAAAGTTAT 
ATTAAAGATA 



AOGAAGACTT 
CAAGGTCAGC 
GAAAATAAAT 
GTOGACAGCr 
TTCCTGCIAA 
TGAAGATCTT 
TAAGAGCAAA 
GAAAAGTTTA 
AGATAACXAT 
TAAATCAATC 
ATAACTGAGA 
TGGAGTATTA 
ACAOATCAAG 
ACTACTAAAG 
CCCAAATTCA 
ATTGCTATAA 
AGCCTCCCAA 
TGATACATAT 
AATTTTGCAC 
TACCTTXAd 
TTTCTTTAAA 



1 

\ 

GC33GOGQGGa 



GAGCAAGOGC 
AAGTGCA1X3G 
GCTGCAGGAC 
CTGACCTTOQ 
GAGAAGTTCA 
ATCCAQTGAC 
TTCTAGAGCC 



11 
I 

CAOACUGOGG 
GGCTOCTGGT 

ACXGOGOGTC 
CTOCTGCCCC 
TGCTGGGGCT 
OCAOCCCCAT 
AATQTGCCCC 
AGTCTCTGCC 



AGGTQTCTGG 
ACAAAACAGC 
CCTTCCTTAG 
TCACTCtUSAT 
TTAACACTAG 
CCCAAAGC3SG 
AATAAAAGAA 



TTGCCCTGOC 
TGACAiCTGAC 
GAOCTGGGGG 
QTOCTGAAAT 
OGGCXGGOCC 
QOAjGQAflATA 
TCTXTAACTT 



21 
I 

OGGGOO^USG 
GCTGGGGCTC 
OOCCrOCTCC 
TTGCAGGGOG 
CrfCOGGCXG 
GCTTTCTQGC 
AGAGGAGACC 
CTQOCAflCOQ 
TCCCAGAOGC 
ACCrCCTGAGS 
TCTGaCTGCA 
TAAGGAACTG 
OCAGGCTGAC 
TGCAGCavCGG 
ACfAlGSOAGGG 
TTTATTTXGS 



31 

! 

ACGTGCACTA 
TGGCTGGCGT 

GQAC06CACA 

citcxggccca 

TTTTTGOTCT 

GGOGGAGAGG 
GGOCTCdCOC 

ggogggagcc 




GAAOU3AAAO 
CAGCATTTGC 
TTGGGGQOCA 
OOOTCACCCT 
CTGGGOCTAA 
GGAQAOTTTG 
AAAAAAAA 



GQAGACQATQ 

GCTG0CX3W3C 
ACTCRTC3VTT 
AAGCTOCICC 
GGGmCAGGG 
GC3AGCCTCAC 
ACAGGGGAGG 
GACTTGACAC 
GGQGGQTTAG 



OGCTCTGABC 
OCGCAOGAOA 
TGTGGOGCTG 
CATTCATCCR 
AACCACAAGG 
GAACCnCCA 
GCTGGCTCAC 
OGGGTGCGCT 
TAGQCCCCAC 
GGAiCCTATtrr 
OCGCCAACTC 
AGAATrCATT 



60 

120 

180 

240

300 

360 

420 

480 

540 

600

660 

720 

780 

840 

900 

960 

1020 

1080

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800

1860
1920
1980
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580
2599



60 
120 

180

240 
300 
360 
420 
480 
540 
600 
660 
720 
780
840 
900 
960 
99& 



Seq ID NO: C143 DNA Sequence
Nucleic Acid Accession #: NM_001819
Coding sequence: 113..2146

1 11 21 31 41 51 

1 I I 1 I i 

CCAGOAGGCA GBCIGGTTTT CCGQGGCCXX: TCCSTCGCGC CTCCCTCCTG COCCTOGCTT 60 

CTCGGQTCCA GCOGGCATCT TCCTTTCOGC ACSUQGGGCOG COOAGOGGOG CCATGCAGGC 120 

AAOGCTCCTT CTCAGCCTCC TOOGAGOOGr GGGQCIGGCG GCTGTCAATT CCATGCCAGT 180 

GGATAACAGG AACCACAATG AAGGAATGGT GACTOGCTGC ATCATTGAG6 TCCTCXCAAA 240 






TaCCTTGTCSS AAGTGCAGOG CTOCACCCAT GhCGGCTGAG TGCOGCCAAG TQCTGAAG7VC 300 

GAGrrAGAAAA GACOTCAAAfi ACAAAGAG^C AACT6AAAAT GAAAACACAA AGTTTGAAOT 360 

AftOATTGTTA AGAGACCCA<3 CTGATGOCTC GaAAGCX:CAC QAGTCXnCCA GCAGGGGAGA 420 

CSGCftGCSRQCC CCAGGGGAGG AGGACATCCA AGGCCCAACA AAGGCAGACA CAGAGAAATG 480

GGCAQAOGGA 060GGGCACA GCGGAGAGCG AGCBSATOAB CCGCAGTQQA GCCTCTATCC 540

CTGCGACAGC CAAGTCTCTG AAGAAOTGAA GA.CAGGOCAT TCTGAGAAGA GGCAGAGAGA 600

GGATQAGGA6 GAGSAOGAGG GAGAGAACTA TCAAAAAOGG GAGCOAGGGQ AAGATAGCA6 660 

TGAA6AGAAA CACCTTaAAG AOCCAOGAGA GACACAAAAG GCTTTICTCR ATQAAAGAAA 720

GCAGGCTTCA GCTATARAAA &AG3Vt3GAGTT AGTGGCCAGA TOGGAAACAC ATGCTGCCGG 780 

GCATTCTCAG GAtaAAGACAC ATAGCCGAGA GAAGAGIAGC CAGGAGAGT6 GAGAGGAGGC 840

AjQQQAGCCAG GAGAATCACC CCCAGGAGTC TAAAQGCCAA OOCOGAAGGC AGGAAGAAXC 900 

TG»3GAAG6T GAOGAAGATG CCACCTCTGA GGTGGACAAA CQACGCACQA GGCCXIAGACA 960 

CCACCAOGGG AGGAjSCAGGC CX^GAiCAGGTC CTCTCAAGGA GGGA6TCTTC CXTTClGAGGA 1020 

AAAGGGACRC CCCCAQGAGG AATCTGAGGA GTCAAAOGTC AGC31TGQCCA GTTTAGGGGA 1080 

AAAGAGGGAC CACCATTCAA CCCACTACAG GGCTTCAGAG GAAGAACCTG AAXATGGAGA 1140

AGAAATAAAG GGTTATCCAG GGQ1KX3U3GC GCCTGAGGAC ClGGAGrGGQ AQCCSCTATAG 1200

GGGCAGAGGA AGTGAAG2^T ACAGGGCTCC AAGACCICAG AGTGAGGAGA GTTGGSATGA 1260 

GGAGQACAAG AGAAACTACC C!CAGCTTAGA GCTTGATAAG ATXSGCACATG QATATGGTGA 1320 

AGAAAGTGAG GAAGAGAGGG GCCTTGAGCC GGGAAAQGQA CGOCATCACA GAGGCAGGGG 1380 

AGOQQAaCCA CQTGCCTATT TCATGTCTGA CACaiGAOnA GAGAAAAGGT TCTTGG6TGA 1440

AGGACACCAC OGTGTCCi^AG AAAACCAGAT GGACSUtiGGCA AOGAGGCATC CACAJVSQTGC 1500

GTGGAAAGA6 CTGOAOUaAA ATTATCTCAA CTACGGTGAG GAAGGAGOOC CAGGGAAGTG 1560

GCAGCAGCAG GGAGACCTGC AGGACACTAA AGAAAACAGG GAGGAAGCTA GGTTTGAAGA 1620 

TAAACAATAT AGCTOCCATC ACAC3«3CTQA AAAGAGGAAG AGATTASCSQG AACTGTTCAA 1680

CXXSkTACTAC GAOCCTCTCC A&TGGAAGAfi CAGOCATTTT QAAAGAAGAG ACAACATGAA 1740

TGACAATTIT CTCGAGQGI6 AOQAGQAAAA TSAGCTGACC TTGAAGGAGA AGAATTTCTT 1800 

CCCAGRATAC AACTATGACT GGTGGGACM AAAGOCClTC TCTOAQaATO TGAACTGGGG 1860 

GTAT6AGIVAG ftC?AAACCTC6 0CAGG6XCCC CAAGCTGGAC CTGAAAAGGC AATATGACAG 1920 

QGTQGCCCAA CT6GACCRGC TCCTTCACTA CAGGAAGAAG TCAGCTGAGT TTCCAGACrTT 1980 

CTATOATTCr GAGQAGOCGG TGAGCAGOCA CCAGGAGGCA GAAAATGAAA AGGACAGGGC 2040

TGACCAfi2iCA. GTCCXCSACAG AGOACSQAGAA AAAAGAACTC GAAAACTTGKS CTQCAATGGA 2100

TTTGGAACTA CAGAAGATAG CTGAGAAATT CAGCCAAAGG GGCTGACTGT CATTGGA608 2160 

GXGGGCACTG T7A2\GAAGCA GCCATCACAT GATCTGTTTT TCAiXlACTTC ACTT^^AAGAC 2220

ACCATTTATA TACCCAAGGG CAGAAAGTAG AACTTACTAT TCATTAAATG TTTGACACAA 2280

TTGGAATXGX CTTTAATTTC TOTGAGAATQ CTATTGAAAA TGlGAATTGC ATGACTTQTA 2340

GCATATTCTT TTCTOCAAAA TAGACATA'XT AACATOCTTA TQACAA1!GAC TGTGCXACIG 2400 

TCTTTGGAAA AATGOlTTGTC TCAOTTGGAA ATAATAAAAG ATTCAOCTGA GAlCC 2454 

Seq ID NO: C144 DNA Sequence
Nucleic Acid Accession #: XM_093082.1
Coding sequence: 93..1988

1 11 21 31 41 51 

I 1 I I I I 

CrrTCTlGTGG TAGG8ACCTC TCCTCAGTAT TTGAAACZAA OC3U3CATCTG ACAGATTTOG 60

AaTTTQTAAA AAATAQCCTC GAAGATIC3US OAATGAAGCT TCIGTGTSAA GGATXAAAAlC 120 

AGCCCAACTG TGTATEAO^ ACATTGAG6T 6GTA00GGTG CCTTATCTCT TCTGCTTCTT 180 

GTGOqGCTCr AGCAGCTGTT CTTAGCACCA GTC3«3T0GCT CACTGAACTG GAATTTAGTC 240 

AGAC3WkACT GGAAGCITCA GCTTTGAAAT TGCTCTATGG AGGCTTAAAA GATCC3UUV.TT 300 

GCAAATIACA. GAAGCTCAAC TTGCAGTTTX CITTATCIGT AACOGCTGCA AAACTTOCAG 360

TTG8AAXGST TGGAft lOTG T TCTGSTTTCS 06GGA3GKTT GG^TGCAATCI CATTTTQGCT 420

ACTGTCAGC3A CROTTC3TTC AAAT6!rGATC rTtPTGTAAQCT GCrCTGSOCT TOCACCAGAG 480 

TTGCrrGCTGC AAAGGATTGT QGQAGTCCTA AQTCCTTOCT ATCJWSAAGGG CTGAACrrGGG 540 

CAGGAAGACT TGAQQCAGTG GAGGAGGTTT TGGGGTTGGQ GGTSCTTOTA CAGOCCGGTG 600

A.CCCAGCATC TCAOGGTGGG GGGCAXTGTG AAAACIATGG GTCTTTTAGA GACnTGGTGG 660

aCTTAGAAGT CAAGOCftOAA CWVGOCTGA GAAAAGGT6G TATGQATCTC CAOAGACOCA 720 

CCCTACAAGT TGK3CTCCTT TGCAAAATCT TCTCCCTCRA ACTATTTCTC TTTATTGCAT 780

TGCCTAATTC TCCTGGTCAG GTTAGTGTGG TGCAAGTGAC CATCCCAGAC OGTTTOGTGA 840 

AOBTOACTGr TOGATCTAAT GTCACTCTCA TCTGCATCTA CACCAOCACT GTOGGOTOCC 900

GAGAACAGCT TTCCATOCAG TGGTCTTTCT T0C3WAAGAA GSAGATGGAG CCaUOTTCTT 960

CrOCXTGGGA GQAGGOQAAIS TGGGGAGAI6 TTGAGOCTGT GAAGOaCI^ CTTGATGGAC 1020 

ABCABGCTGA ACTOCAGAXT tTAiClTTTCTC AAGGTGGAGA AGCTGTAGCX! ATOGGGCAAT 1080

TTAAftGATCJG AAlTACAOOQ TCCAACGATC C3USGTAAT6C ATCTATECACT ATCTOGCATA 1140 

TGdAQCtBVGC AGACAGTGGA ATTTACATCP GCXSATGITAA CAACCCCXX3V GACmTCTCG 1200

GCCRAAACCA AGGCATCCTC AACQTCAGT3 TGTTAGTGAA ACCTTCTAAO CXXCTTTGTA 1260

GGGlTCAaGG AAGACX^SAA ACIGGCCACA CEATITCQCT TTOCTGTCTC TCTGCGCTTG 1320

QNiCMXTSC CCCrGTGXhC TACTGGCATA AACTTGASGG AAGAGACATC OTaCCAGTOA 1380 

AAGAAAACTT CAAOCCAACC ACGOGGATTT rrGffiPCATTGG AAATCTGACA AATTTTGAAC 1440 

AAGGTTATTA OCRGIGTACT GCCATCAACA GACTTQGCAA TAGiaJCSCTGC GAAATCSATC 1500 

!FCACTTCTTC ACATCCAGAA GTXGGAATCSl TTGTTG6GGC CTTGMIGGT AGCCSOGTAG 1560

GTGCCGOCKX CATCAXCTCT GTTGTGTGCT TOQCftAGGAA. TAAGGCAAAA GCAAAGGCAA 1620 

AAGAAAGAAA TTCTAAGACC ATOGCGGAAC TTGAGCCAAT GACAAAGATA AACCSC2AAGGG 1680 

GAGAAAGGGA AGCAATGCCA AOAGAAGAOG CTACOC!AACT AGAAGTAACT CTAOCATCTT 1740 

CCATTCATGA GACTGGCOCT GATACCATCC AAGAAOCAGA CTATGAGCCA AAGCCEACTC 1800

AGGAGCCTGC CCCAGAGCX:!r GCCCCAGGAT CAGAGOCIAT OGC3UQTGCCT GACCTTGACA 1860 

TGGAGCTGGA GCZGGAOCCA GAAACGCAGT GQGAATTGQA GOCASAGCGft. GftlGGCMSUSC 1920

CAGAGTCAGA GCCTOGGGTT GIAGTTGAGG OCITAAGIGA AGATGAAAAO GQAGTCQTZA 1980 
AGGCATAG 1988

Seq ID NO: C145 DNA Sequence

Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..1242



1 11 21 31 41 51




1 I i ) i ) 

AT(3CT<»rTCG CATTTTGGAA GGTCTTXCTG ATCCTAAGCT GCCTTGCAGG TC3U5GTTAGT 60 

GTGQTGCAAa TQACCATCCC AGAOGGTTTC QTGAACGT6A. CTGTTGGATC TAATGTC&CT 120

CTCATCTGCA TCTACACCAC CACTGTGGCC TCC06AGAAC J«3CTTTCCA.T CCAGIIGGTCT 180

TTCTTCCATA GGAGCCAATT TCTTCTCCTT GGGAGGAGOG GAA3TCGOCA. 240

GATGITGAOG CnSTGAAGGQ CACTCTO3AT GGTkQUSCAGG CTQAACTCCA GAXTTACTTT 300

TCTCRAGGTG GRCAAGCTGT AGCCATCX3GO CAATTTAAAG ATCGAATTAC AGG6TCCAAC 360

QATCCRGGTA ATtSCRTCTAT C3VCTATCTCG CATATQCaVQC CAGCAQIiCAS TC3QAATTTAC 420 

ATCTGCGATG TTAACAACCC CCCAGACTTT CTCGGCCAAA AOCAAGGCAT OCTCRACGTC 480

AGTOTGTTAO TOAAACCrTTC TAAGCCCCTT TGTAGCQTTC AAGGAAQACC AQAAACTGaC 540

CACftCTATTT CCCTTTCCT6 TCTCTCTGCG CTTGGAACAC CTTCCOCTGT GTAjCTACIGG 600

CATAAACTTa AGQGAAOAGA CATCGTQCCA aTGAAASAAA ACTTCAACCC AACCAOOGGG 660 

ATTTTGGTCA TTGGAAATCT GACAAATTTT GAACAftSGTT ATTACCAGTG TACT6CCATC 720 

AACAOACTIO GCAATAGTTC CTOCOAAATC tSATCTCAjCTT CTTCACATCC ASAaSTTGGA 780 

ATCATTGTTG GGGCXTTTGAT TGGTAGCCTG GTAGGTGCCXS^ CCTiTCATCAT CTCTGTTGTG 840 

TtGCTTCBCftA GOAATAAGQC AAAAGCAAAG GCAAAAGAAA GAAATTCTAA OACXSVTCGCCJ 900

GAACTTGAGC CAATGACAAA GATAAACCCA AGGGGRGAAA GOGftAGCAAT GCCAfifiAGAA 960 

GACGCTACCC AACTAGAAQT AACTCTACCA TCTTCCATTC ATGAjSACTGQ CCCTGATACC 1020 

ATOCAAGAAC CAGACTATGA GCCAAAGCCT ACTCftGGAGC CTGCOOCAGA GCCTGCCCCA 1080 

GGATCAGAGC: CTATOaCAGT GCCTGACCTT GACATQGA3C TGGAOCTGOA GCCAOAAAOG 1140 

CAGXOSGAAT TOG&SCCAGA GCCAGAGCCA GAGCCftGAGT CAGAGOCrGG OSITGTAGTT 1200 

GAGCCCTTAA GTGAAQATC3A AARaC3GAGT(3 QTTAAQGCAT AG 1242 

Seq ID NO: C146 DNA Sequence

Nucleic Acid Accession #: NM_003020.1
Coding sequence: 29..664






1 11 21 31 41 51

I I I I I i 

CBCTCCTXSGG OCTGCCKCTC CaTTGACAAT GGTCTCCAGG ATGGTCTCTA CCATGCTATC 60 

TGGGCTACTG TTTTGGCTGG CATCl:OGA:l:G GACTCCAfiCA TTTSCTTACA GCCCCCGQAC 120 

CCCTQAC!06G GTCTCAOAAG CAGATATCCA GAGGCTGCTT CATGGTGTTA TGGASC7VATT 180 

GGGCATIGGC AGGGCCCGAG TGGAATATOC AGCTCACCAO GOOlTQAATC TTGr OG OOOC 240 

CCAGAOCATT GAAGGTGGAG CTCATQAAGG ACTTCAGCAT TTGGGTCCTT TTGGC3\ACKr 300 

OCCCS^ACAXC GTGGCAGAGT rrGACTGGAGA CAACAITCCT AAGGACTTTA QTGAaOATCA 360

GGGGTACCCA QMXXTrCCAA ATCCCTGTCC TGTTGGAAAA ACAGATGATG GATGTCTAGA 420 

AAACAGCGCr GACACIGCAG AGTTCAGTGG AGAGTTOCAG TTQCACC3UGC ATCTCTTTGA 480

TCC3Bt3AACAT QACTATCCAG GCTTGGGCRA GTGGAACAAG ARACTCCTTT AOGA6AAGAT 540 

GAAGGGA3GA GAGAGACGAA AGOGGftSGAG TGTCAATCGA TATCSACAAa 0RC3W3AiaACT 600 

QOATAATGTT OTTQCAAAGA ASTCrGTOX; CCATTTTTCA GATGftGGATA AGGATCCAGA 660

GTAKAGAGAA GATGCIAGAC GAAAACCCAC ATTACCTGIT AGGCCTCAGC ATCK3CTTATG 720 

TGCACQTBTA AATGOACrTCC CTGTGAATGA CAGCATGTTT CTTACATAGA TAATTATOGA 780 

TACAAAGCAG CTGTATGTAG ATAGTGTATT GTCTTCACAC GGATGATTCr GCTTTTTOCr 840

AAATTAiSAAT AAiaAGCTTTT TTOTTTCTTC GGTTTTTAAA ATGTGAATCT GCAATGATCA 900 

TAAAAAITAA AATGTGAATG TCAACSU^TAA AAAGCAAGAC TATOAAAGGC TCAGA1TTCT 960 

TGCKGTTTAA AATGGTGTCT GAeGTTGTAC TATTTTGGOC AAGTCTGTAG AAAGCTGTCA 1020 

TTTGATIITG ATTAXGXAGT TCATOOUSCC CTTGGQO^TT QTTATACACC AGTAAAGAAG 1080

GCTQTACTCA AQAGGAiGGAG CICACftCATT TCACTTGGGT GOGTCITAAT AAACATGAA7 1140 

GCAAGCATTG OC 1152 






Seq ID NO: C147 DNA Sequence

Nucleic Acid Accession #: NM_024021.2

Coding sequence: 144..806



1 11 21 31 41 51 

I I I I I 1 

AAGAT TCCTG CAAATGQTTT CAATATATGC AOATGTCTCG ATATAGGAAT OAAATTAOOT 60 

CTTTaaAACA ACTTAAATAA GrCAAATATA CTTGGRGCTT TAAAAAITAA AAQGAGA6A6 120 

ATTCQAGCAC CTTTTCIGCT 6CCATGACAA GCATGCAAGG AATQGAAC3U3 GCCA1X3CCAG 180

GGGCTGGGCC ^GGTGTOCSac CAGCTOOOAA ACKTOCSCTaT CATAGATTCA CATCTGTGGA 240 

A2M3GATTGCA AGAGAAGTTC TTGA&CGGA6 A7UQ0CAAAGT OCTTGGGGTT GIGCAGAITC 300 

TGACIGOCCT GATGAGCCTT AGCSVTQGGAA TAACAATGAT QTOTATGGCA TCTAATACTT 360 

ATQQAAQTAA CXXXATTTCC GTGTATATCG GGIACACAAT TTGGGGGICA GTAATGTTTA 420

TTATTTCAGG ATOCTTGTCA ATTGCAGCAG GAATZAGAAC TACAAAAGGC CTOQirCCGAa 480

ST ftOrC TftaS AAXGAATATC ACCAGCTCTO TACTGGCTGC ATCnSGGATC TTAATCAACA 540 

CATTraaCTT OGOGTrrCAT TCATTCCATC AOCCTTACaiG TAACWACTAT GQCAACTCAA 600 

ATAATTGTCA TGOGACTATG TCCATCTTAA TGGGTCTGGA TGGCATGGTG CTCCTCTTAA 660 

GIGUGCTGOA ATTCTGCATT GCTGTGTCCC TCSCTQCXITT TOGArCfKAAA GTQCTCTSTOr 720 

GTAg^CgGq TGGGGTTGTG TTRATTCrGC C3VTCACATTC TCACATQGCA OAAACaGCAT 780 

CrCCCACACC ACTTAATOaQ GTTTOAGGOC ACCftAAAOAT CMCAQAlCAA AXGCTOCAGA 840

AATCIATGCT OACTOTGACA CAAGAGCCTC ACATGAGAAA TTAOCAGTArr GCAACTTOGA 900 

TACTQAIAGA CTTOTTGATA TTATTATTAT ATQTAATCCA ATTATSRACT OTGTGTGTAT 960 

ASAGAGATAA TAAATTCAAA ATTATQTTCT CATTTTTTTC OCTGGAACTC AATAACTCAT 1020

TTCAi:n:GGCT CTTTATOGAG AGTACTAGAA GTTAAATTAA TAAATAATBC ATTTAATGAG 1080

GCAACAGCAC TTQAAAGTTT TXCSITTCATC ATAAGAACTT TATKCAAAGG CATTACATTG 1140 

GCAAATAAGG TTTGOAAGCA GAAOAGCAAA AAAAAGArAT lOTTAAAATB AGGOCTtXAT 1200 

GCAAAACACA TACTTCCCTC CXATTTATTT AACTTTTTTT TTCTCCTACC TATGGGGACC 1260 

AAAGTGCTTT TTOCTTCAGa AAfiTGGAGAT GCRTGGCCAT CTOCCCXIIXIC CTITTTCXTTT 1320

CTCCTGCTTT TCTTTCOOCA TAGAAAGXAC CTTGAAOTA0 CACAQTCCGT CCTTGCAirGT 1380

GCACQAGCTA TCATTTQAGT AAAAGTATAC ATGGAS-TAAA AftTCATATTA AGCmCAOAT 1440

TCAACTIATA TTTTCrATTT CAUTCTTCTTC CTTTCOCTTC TCCCACCTTC TACTGGGCAT 1500 

AATTATATCT TAATCRTATA TGGBIAATGTO CAACATATGG TATTTGTTAA ATAOGITT6T 1560 

TTTTATTGCA GAGCAAAAAT AAATCAAAIT AGAAGCAATA AAAAAAAAAA AAAAAAAAA 1619

Seq ID NO: C148 DNA Sequence

Nucleic Acid Accession #: NM_002091.1

Coding sequence: 56..502



1 
I 

AGTCTCTGCX 

AGCGGTCCOG 

TGAGAGAGG6 
GAATTTGCTQ 
GGCCTTGOGC 
AQQTTCAAAA. 
CCCCCAGCTG 
TAAOACSACTa 

CTTCTQOTTT 
TTTTSATATC 
TAAAAGC^A 



11 
I 

CTTCCCAGCC 
CrrCCCX3CTC3G 
CTCCCTGCGG 

AGCCTGAA6C 
GSTCTCATAO 
AATCAGCW3C 

AACCAGCAAT 
AGTTCTQCftA 
CTATTCTGTA 
AAACXTGTTT 
TAeSGCTACCT 
AACACAT 




GCGGRGGGAC 
TAATOGOOAA 
AGCftfiCTGAG 
AAOCAAAGOA 
CTTOGnCGGA 
GTAGACTCTC 
GATAATGATG 
QCATCAOTTC 
TCTXTC2VT0C 
GCTQTOaACA 
GTTG6TTAGA 



31 

1 

GCTCCAAGGG 
GOTQOTGCTC 
CGTGCTGACC 
AAAOASCACA 
AGAGTACATC 
QAACAQAAAC 
TTCAGAGGAV 
TQCTCCAGGT 
GCCTCTCTCA 
TACGOATCAT 
TTGAGTAAftT 
ATTGTGQAAA 
TTCAA6GCOC 



41 

I 

TGCCTAGCQC 
AAGATGTACC 
CBOQGAjSTCTT 
AGGTGGGAAG 
CACCAGC3CAC 
AGCAGCAACT 
TCTCRACQTQ 
AAAGAGAAAA 
CAACAASMrr 
TOGTGATTTT 
AGAGTCITOC 
OGAGCT6TTA 



51 
I 

GGACCATGOG 
CCCOGGGGCG 
CGCGCGGCAA 
CTTCTGTTTC 
AAGCTGCAAG 
CTCftACCCAA 
TCAAAGA1GT 
AAGGAAGGAA 
ACAAAACCCC 
TCCTTGTGCA 
CAAGCAGCAT 
AATTAATGCT 
CCATTCACAA 



Seq ID NO: C149 DNA Sequence
Nucleic Acid Accession #: NM_012261.1
Coding sequence: 203..1045

1 11 21 31 

I I 1 I 

GATTTGCTCT GOCRSCTVGCT GT0GGTCC06 0GCT06ACAC 
ACAQAATACG CGCTCCCTCC CTCCCCCTTC TCTGTCCCCC 
CACTCX:!AGOG GCGACTTTGA GQGATTCCCT CTCTGGOGGC 
CCICATTC66 GGCACXGGGA GTATGGA^rCX CCAAGGAABA 
ACTTGGAGTT CTCCTGATGT TGTTCCATAC AATGGCICAA 
□GAAAATCTC TCAGGCCTTT CCACTAACCC TOAAAAAGAT 
TGGGACGAOG TGTCTCATGG CAGAGTTTGC AGCCAAATTT 
OGCCAGCAAC TACGTASATC TOATCAJCAGA ACAGGCCBAT 
TGAGGTGAAG GG0C6CTGTG GCCACAGCCA GT066AGCIG 
CGCATATGCA. CTCAAAATGC TCTTTGrAAA GGAAAQCCAC 
GGCGACTTGG AGGCTQAGCA AAGTGCAGTT TGTCTAOGAC 
CAAAGACGCR GTCAGTGCTG OGAAGCACAC AGCCAACTCQ 
CACOGCOGCT OGGAAGTCCT ATGAGTGTCA AGCTCAACAA 
TGAXCGGCAG AAGAOGGTCA CCATGATCCn: GTCTQOGQTC 
TATCTCIUAT TTTGTCTTCA 6TGSAGAGCA TAAATGCCOk 
GGAAGAAACC rEGCOCClGA TTTIGGGGCT CKTCTTOGQC 
06CGATTTAC CAOGTCCACC ACAAAATGAC TGCCAACCAG 
ATCCCAGTAT AAGCACATOQ GCTAOAGGCC: QTTAGGCAGG 
CCAACXGGAT CAGGTAGAAC AACAAAAGCA CTTTTCCATC 
CATAGCISACA ATCAATUCRGG CClGaOTATC TGAOGUT'IGC 
AACCCAOGGA AC36GGGAGAC TCTITGGGAT TTGTAGGGT6 
ATGCTOGGGA GGAGGGGAfSS AGGGTCTCAG ACAGCTTTCG 
TQACTCTC3CA AAGAGCAATA AATGCCACTT GGAGCTGTAT 
TTGAA AACftT 6CTTCTTTGA OGAGGAAACC CQnTTAGGTT 
TCCTQCCnG OAiOUCAaCZQ GCrmTOCTA TACAGCTGTC 
TCAT6CTCOC TGCAGC3UU3A CGGCTGAAAG TGATXCAOIGC 
GTTTA61X3AT TGTCTTGGGA ATGTTTCACT GCTACCCGCA 
AAAAC3ACTA ATOTAACTAT GCAGAGTTGT TTGGACTTCT 
GGGGGACCI6 AAlGAATCAAT CTOT(?rQA9T CnTFTTTTCA 
TTCrCTOGC 

Seq ID NO: C150 DNA Sequence

Nucleic Acid Accession #: NM_003226.1

Coding sequence: 2..226



41 
I 

C6AGTCXn!AG 
GOCCCTCGCT 
CTCTGCAGCA 

AXCATGGCAG 
ATATTTQTQG 
ATTGTACCTT 
ATOQCATTGA 
CAAuiVX-iCT 
AACATGTCCA 
TCCT0GGA6A 
CRCCACCTCT 
ACCATTTCaC 
CACATCCAAC 
GTGQVTGaGC 
grOOTCATQL 
GIGCAGATCC 
CACCCOCTAT 
TTGTACAOGA 

ttgqcttgtg 

AAATGGCAAT 
TGCTCATQGT 
CTQGOOCCAA 
CAGAAGLAATA 
AATGCACACA 
TTCIGGCrGG 
TCCAQCGACT 
TCCT6TG0CA 
AAATGAAATA 



60

120 

180

240 

300 

360 

420 

480 

540 

600

660 

720 

780 

797 



SI 
I 

CTAGGCGCTC 
CACCCCGGOC 
GCACAGCCGQ 
GCAIOGACAG 
AACAAGAAGT 
riGCGGGAAAA 
ATGATGTGTG 
CCGGGGGAGC 
GGGTGGATCG 
AGGGAGCTGA 
AAACCCACTT 
CTGCCTTGGT 
TGGCCTCTAG 
CTTTTGACAT 
GGGAGCAACT 
TGGTAACACT 
CTOGGGACAG 

TccraCTocc 

GATACACCAA 

tccatgctta 

TATTCTCTCC 
GQCTTG6CTT 
AGTTTAGGGA 
TGGGGIGCIT 
GAATACAACC 
O^TTCIGCAT 
OCAGCACCAG 
GGTCCAAGTC 
AAAOUIACXA 



1 
I 

GATGCTGGGG 
GTCTGCAAAC 
CACGCOChAiG 
TTOGTOTTTC 
COCCTGQGAT 

ctcagttttt 

CTGATGTCTT 



11 
I 

CTGGTCCTGG 
CAGTGTGCCG 
GtUSTGCAACA 
AAGOCGCTGA 
GOU^aCTOAG 
CTGTCCCTTT 
AACOAATAAA 



21 
] 

CCTTGCTGTC 
TGCOGGCCAA 
AOCQGGGCT8 
CTAGGAAGAC 
CACX^CTTGCC 
GCTCCGGGCA 
GOTCCCATGC 



31 41 

I 1 

CTCCAGLTLa.' GCTGAGGAGT 

GGACAGGGTG GACTGGOGCT 
CTGCTTTOAC TCCaQCIAXCC 
ASAATGCACC TTCTGAOGCA 
CGGCTGTGAT TOCTGOCAGG 
AGCTTTCTGC rTGAAAfiTYCA 
TCCACCCO 



51 
I 

ACGTGGGCCT 
AOCCCCATGT 
CTGGAStGCC 
CCIOCAGCTO 
CACnSTTCAV 
1»TCTGGAGC 



60 

120 

180 

240 

300

360 

420 

480 

540 

600 

660 

720 

780

840 

900 

960 

1020 

1080 

1140

1200 

1260 

1320 

1380

1440 

1500 

1560 

1620 

1680

1740 

1749 



60 

120 

180 

240 

300 

360 

39a 



Seq ID NO: C151 DNA Sequence

Nucleic Acid Accession #: NM_002993.1

Coding sequence: 64..408

1 11 21 31 41 51 

I I I I I 1 

□GCACGAGCC AGTCTCCaCQ DCTCCACCCA QCTCAGGAAC CCGCQAADCC TCTCTTGACC 
ACTATGA6CC TCCCX3TCCAG COGGGCOGCC CGTOTOCOGG GTCCTTOGG6 CTCCTTCTGC 
GCGCXOCTCG GGCTGCTGCT CCTGCTOAOS CC3GCOGOG6C CCCIOGOCAO CaCTGGTGCr 
GTCTCIOCTQ TGCTGACAGA GCTGGBTTGC ACTlGTTTAC GOBTTACaCT GAGAGTAAAC 



60 
120 
180 
240 






CCCAAAACGA TTGGTAAACT GCAGGTGTTC CCCGCAGGCC CGCAQTGCTC CfLAGCrrGOAA 300 

GTQGTAGCCT CCCTGAAGAa. CQGQAA3C5CAA GTTTGTCTGG ACCCGGAAGC CCSTTTTTCTA 360 

AAGAAASTOV TCCMaAAAAT TTTGGACAGT GOMUVCMOA AAAACIGAGT AACMAAAAS 420

ACCATGCATC AXAAAATTGC CCJMSTCTTCA GCGQWSCAGT TTTCT0GAC3A TCCCTGGAOC 480 

CAGTAAOAAT AAOAAGGAAG GGTTGGTlTT TTTCCATTTT CTACSO'QGRT TCCCTACTTT 540

GAAQAGTGTG GGGGAAAGCC TACaCTTCTC CCTGA3M5TTT ACAGCTCAQC TAATGAAGTA 600 

CTAATATAGT ATTTCCZAfrTA TTTACT6XXA TTTTACCTOA TAAGXTATTG AACCCXTTOG 660 

CAATTGAOCA TATXG1K3AGC AAAGAATCftC XOGXTAlTTAG TCTTTCAATG AATATTGAAT 720 

TG^LAI^ATAAC TATTOTATTT CtATCATACa TTCXTTTAAAQ TCTTACCGAA AAfiGCTQTGG 780

ATTTCGTATG GAAATAATGT TTTATTAGTG TGCI6TTGAG GGAGGTATCC TGTTGTTCTT 840 

ACTQUZTCTT CTCATAAAAT AOGAAATATT TTAGTTCTGT TrTCTTGGGG AATATQTTAC 900

TCTTTACCCT AGGATGCTAT TTAftGTTGTA CIGX3I3CTAGA ACACTQGGTG TGTCATACCG 960 

TTATCTOTQC AGAATATATT TCCTTATTCA GAATTTCTAA AAATTTAAGT TCTGTAAQQQ 1020

CTAATATATT CTCTTCCTAT GGTTTTAGAT GTTTGATGTC TTCITAGTAT 06CATAAIGT 1080 

CATGATTTAC TCATTAAACT TTGATTTTQT AH3CTATTTT TTCZACTATAQ GATOACTATA 1140

ATTCrrCGTCA CTAAATATAC ACTTTAGATA GATGAAGAAG OCCAAAAACA GATAAATTCC 1200 

TC3ATTGCTAA TTTACATAGA AATOrATTCT CTTOQTTTTT TAAATAAAAG CftAAATTAAC 1260 

AATGATCTGT GCTCIGCAAA GTTrTGAAAA TATATTTGAA CAATTTGAAT ATAAATTCAT 1320 

CMTTXASTCC TCAAAATAXA TACAGCATTG CTAASATTTT CAGATATCTA TTOTGGATCT 1380

TTTAAAGGTT TTGACCATTT TSTTATGAGG AATTATACAT (3TATCACATT CACTATATTA 1440 

AAATTGCACT TTTATTTTTT CCTGTQIGTC ATGTTGGTTT TTGGTACTTG TA^QTCATT 1500

TGGAGAAACA ATAAAACATT TCTAAACCAA AAAftAAATM AAAAAAA 1547 






Seq ID NO: C152 DNA Sequence

Nucleic Acid Accession #: NM_005242.2

Coding sequence: 148..1341



C3C3GCCOGCCC 
TTTCTCTCGG 
OGGCCOGGOO 
GCCGQCAlfOC 
TCCTCTAAAG 
GGAGTTACAG 
AAACTOACCA 
AerAAGQGCA 
ATTTACAIGQ 
ATTGOCrATC 
ATTOGCTTTT 
CAGAGGTATT 
ATTOaCATCT 



11 
I 

TGOOGAGGCG 
TGGGTOCAGT 
TOGGGGCTTC 
TGCTAGCAGC 
GAAGAAGCCT 
TTG?iAAGAGT 

TGGCCCXGtTG 

ACATACATGC 
TCTATOOCAA 
GGGTCATCGT 



QAGCAGCTCT 
CTGrrCOCAG 
TCTGCCATGG 
GTCCTGGCCA 
TTTCTOAmiA 
CTCTCTACCC 
AQGOATCAIO 
GTATOCCTCA 
ACTGTTAAGA 
TOQAACCrGT 
ACATACCAOC 



OCATCITCAT 
TGGTGGGAGA 
OCTTCCTCAC 
ATGAAAACTC 
TGTAOCIGAT 
ASAGCC3U9G(3 
TTAACAGCTG 
CnAAQAAOQC 
CCTCAhAGAA 
GCTCCTATTO 
TTAATGraAT 
G 



2X 
I 

CX3CAI3CRQAO 
GGAGCTCTGA 



TATTGGTAaG 
CTTTTCTGTG 
TCCAA'X-i'uTC 



CTTGQCTQAC 

CAACAACTQG 
CATGTACTQT 
GAACCGCATG 
ATQQCTQCTB 

CATGTTCAAT 
AiSCCTCTGCC 
AGAGAAiSAAA 
CXGCnCACX 
CXAGAGOCAT 
CATOGftCCCC 



ACACrCCAGG 
GAQGAOGTGT 



31 
1 

GCTCOGATTC 
GTTTCGAATC 
CGGAGGGOCA 
TOCnOTSGCA 
GTTGATGGCA 
GAXGAGTXTT 
TACACAATTG 
TTCCeAACTA 
CTOCTCPCTG 

ATrcAioeas 

TCCATTCTCT 
GGGCACTCCA 
ATTCTGCTGG 
A3\£!h.TCACQA 
TAJCTTCCTCT 
TATQTQCTGA 
AGGAAGAGGG 
CCTAQVAACC 
GTCTATGCCC 
TTTGTCTATT 
C5AAGTOTCC 
AAATCCAGCr 
OTCCTCROAT 
CrSTTATTTC 



GGGGCAGGTG 
GGTG6OGG0G 
^GOGSOGTG 
CCATCCAAQQ 
CATCCCACGT 
CTGCATCTGT 
TGTTTGTGGT 
A(3AAfiAAGCA 
TCATCTGGTO 
AAGCTCTTTG 
TCATGACCTG 
GOAAOftJUSGC 
TCAOCATOOC 
OCTGTCnStSA 
CTCTGGOCAT 
TQATCAaAAT 
CCATCAAACT 



TGTACKTTGT 
ACTTTQTTTC 
GCACTGTAAA 
CXTAjCiTCrTC 
O0C3AATTGCA 
CTAftSCAAAA 



SI 
1 

AGAGGCTGAC 
OATTCCCCQC 
GCTGCTGGGG 
AACCAATAOA 
CACTGGAAAA 
CCTCACTGGA 
GGGTTTGGCA 

O0C3CTTGAAG 
TAATPTOCTT 
CCTCAGTGTG 
AAACATOXKX: 
TTrGTATGTC 



GCTGCGATCT 
GATTGTCACT 
GCrJXJUATTAT 
AGCCCTCTGC 
ACATQATTTC 
GCAGATGCAA 
AAGTTCAACC 
CAGTAGGKTG 
AlBSTCFCACC 



Seq ID NO: C153 DNA Sequence

Nucleic Acid Accession #: NM_003469.2

Coding sequence: 92..1945



1 
1 

QAAACQQCCC 
CATAXAAACA 
ASCAOOCCTQ 
TCAGAGAAAC 

IAA5GAAGAA 
AAAAOAAZU^T 
GATGAGAATA 
AQAAAATAAG 
TTATGACxACA 
TQAAISAGAAT 
TACTCCTCAA 
ACC!AAACAAC 
AGATGATATC 
GAAC30C3iaTA 
OAAXATAGGA 
OCAGGAAGAA 
AA'OrTGCCrAT 
AAA3GSOGAA 
GCTQATTGAA 
AACTQQGGAG 
CCTTAGATGSVC 



11 
1 

GAGAAGCTCX3 
AAAAGAGGAA 
TCTCTTATGC 
CAGCTGCTTC 
QAAATTQATCA 
AGCAGCCCAG 
QCK^TQAAA 
ATACTCS3AAG 
CCCTATGCCT 
CAGCAGTGGC 
TOCAOOGATA 
A60CTTGCTA 
CAGAAACGTG 
TACAAGGCTA 
GAGGAGAAAA 
AAAAATGAAC 
GATCTTCGGA 
TTGAAAAGGT 
AOGGCCACCA 
ATCTCAASGA 
AAGCGGAATO 



21 
I 

OCOGGAGAAC 
ATCTT^CAAA 
CTTTAATTTT 
AGAAAGAAGC 
GGGCTTTGQA 
ATTATAATOC 
GGCACTTGCC 
CTTTQAGACA 
TCAATTCAQA 
CAOAAAGAAA 
AOOCXrTTIAA 
CATTGGAATC 
AGAGQAIGGA 
ATAACATlXaC 
TAGAGAGICA 
AAATCAACQA 
AAGAGAGTAA 
TAOXAAATGC 
GGCITZTTIaA 
ATTTACAGAT 
OKTCAGTOGA 
CTGACTTSOA 



31 
I 

o<^;gag6aat 

CA.TGQCTC5AA 
OCTCATCTCT 
AGACCTCAC3G 
GTACATAGAA 
CTAOCAAGCT 
OSAGAGGOAT 
GGCTGAAAAT 
AAAOAACTTT 
GCTTAACCAC 
ACGCZACAAAT 
TGTCTTOCRA 
TGASQAQCAA 
CTATOAAGAT 
AAOOCAGGAA 
TGAGATGAAA 
AOACXZAACTC 
TOCAGGAACTT 
GAAACCTCTT 
ACCCCCAGAA 
AC0GGMSC9GG 
0C31TGCAGAC 



41 
I 

ATGCTOTQGA 
GCAAA0ACCC 
OGOGCTGAAG 
TTGGAAAATG 
AAOCTCOGAC 
GTCTCrGTOC 
TCACFGASTG 
GAGCCTCAQT 
0CAA1GGACA 
ATOCAATTCC 
GAAATAGTGG 
GAJ3CreG(5QA 
AAACTTTATA 
OTQGTCGGGG 
eAGGTGAGA£3 
CGCTCAGGGC 
TCA6ATGATG 
OGGAGGTTAC 
OATTCTCAGT 
GACTTAATTQ 
QAGCTTGAGC 
CrOTTCCAAA 



51 
1 

GCTCCrCTGC 

CAGCTTCATT 
TCCAAAASTT 
AACAAGCTCA 



AAGAAGACTG 
CJWZAOCAAA 
TGAGTGATGA 
CTCCTATGTA 
AGGAACAATA 
AACTQAC3U?G 
CGGATGATGA 
QAGAAGACTC3 
ACAGCAAAGA 

AGCxrrGGCiyr 

TCTCCSU^GT 
AGAATGG6CA 
CTATTTATCA 
AGATGCTCAA 
TTCCTGtZGA 
ATAOGAXaCT 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080

1140 
1200 
1260 
1320 
1380 
1440 
1451 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080

1140 

1200 

1260 

1320 






CTOCAABAGI OGCTACCCTA AAACAOCTG6 TCGIGCCGGO ACT^USGCCC TACCAGAOGG 1380 

GCTCASTGTT GAGOATATTT TAAATCTTTT AGOaftTGGAG AGTGCAGCAA ATCAGAAAAC 1440 

GTOGTATTTT CCCAATCCAT ATAACCW3GA G/iAAGTTCTa CCAAGGCTCC CTTATGGTGC ISOO 

TGQAAGRTOT AOATCBARCC AGCTTCCCAA AGCTGCCTGG ATTCCRCATG TTGAAAACAG 1S60 

D ACAGATG6CA TAIGAAAACC TGAACGACAA GGATOUUSftA ^TAGGTGAQT ACTTGGOCAG 1620 

GATGCIAICTTT AAATACCCTG AGATCATTAA TTCAAAGCAA GIGAAGCX3AG TTCCTOGTCA 1660 

AGGCTCATCT GAAGATGACC TGCAGGAAGA GGAACAAATT OAGCAGGCCA TCAAAGAGCA X740 

TTTGAATCAA GGCAGCTCTC AGGAGACTGA CAAGCTGGOC CCGGTGfiGCA AZUMSQTTCCC 1800 

TGTGOGGOCC CCX5AAGAATG ATGATACCCC AAATAGGCRG TACTGGGATG AAGATCTGTT 1860 

10 AATGAAAGTG CTGGAATACC TCAATCAAGA AAAGGCAGAA AAGGGTUVQGG AGCATATTGC 192 0 

TAAGAGAGCA ATGSARAATA TGTAftGCTGC TTTCATTAAT TACCCTACTT TCATTOCTCC 19SD 

CACCCCAAGC AAATCCCAAC ATTTGTCTTG AGIGIGTTQA CTTCTATCCT QTTAACACTG 2040 

TAATATCTTT AARTQATGTA C3«3aCAGATG AAACX^^GGTC ACTGGGGAGT CTGCTTCATX 2100 

TOCTCTGAGC TGTTATCTTG TGrATGGATA TGTGTAAATG TTATOACTCC TTQATAAAAA 216 D 

15 ATTTATTATQ TCCATTATTC AAGAAAC3%TA TCTATGACXG TGTTTAATAG TATATCTAAT 2220 

GGCTGTGGCA TTGTT^TGC TCACATATGA TARAAAAGTG TCCTATAATT CTATTGAAAO 22 BO 

TTTTTAATAT TTATTGAATT ATTTTGTTAC TOTCTBTAGC GTTTTGTGGA GTACTGGACC 2340 

AAAAAAATAA AGCATTATAA ATATA 2365 

Seq ID NO: C154 DNA Sequence

Nucleic Acid Accession #: NM_030955
Coding sequence: 327..5108

1 11 21 31 41 51 

| | | | | |

GAATTCCQGG AOCXaGGCOGG CTGOGAGGCX: GOGGGGCATG CGGGAGGOGG AGGOGTGGGA 60 

OOOGGTGGCr GCX3CCCATTC CACACCCJGCX: GAAAGOSGAC ACIGTCAGCr GAATCACTCC 120 

CCTTTTAGGA GGAGGGAQGO GGAAAAQQTG TCTA3CTAAT TTCTGCTTAA AAAAGCAO^ 160 

GAGAT0GCX3G GTCAGCTTTG CftGTOGCTGC CTTCXOtaCXSC CTGACCRTGC ACCCCTGCAT 240 

30 CTTCCTGCTG GGCACAGGOG AGCGCTTTAT TTCTOQftaCT GAGGGCXftAA ACTTTTTTCA 300 

CTTTTCTTCT CCTCAACATC TGAATOWGC CATCTGOCCR GAGGftGCTGG CTTGCAMCC 360 

TTTC06TGGT GGCTCAGCTC CXVAACTXTG OGOCOCTTTQ CTATGGGAGA CAGCCTCAGC 420 

CAOGCOOGGT TOGCTTCOCG GACAGGAGGC AAGAGCATTT TATCRAGGQC CTQCCAQAAT 4a O 

ACCAiCSTGGT GGGTCCAOTC C6ABTAQATG CCAGrOGGCMV TTTTTTGTCA TATGGCTTGC 540 

35 ACTATCCCAT C21CGAGGAGC AGGAGGAAGA GAGATTTGGA TGGCTCAQAG GACIGQGTC3T 600 

ACTACAGAAT TXCTCACGAG GAGAAGQACX: TQTTTTTTAA CTXGAGGGTC AATCAAGGAT 660 

TTCTTTCCAA TAGCTACATC ATGGAGAAGA GATATGGGAA CCTCTCQCAT GTTAAGATI3A 720 

TGGCTTCCTC TGCCCCCCTC TGCCATCTCA GTCWCRCQGT TCTACAGCAG GGCACCAGAG 780 

TTQQGACBaC AgCCCTCAGT GCCTGCCaVTC GACTGACTOG ATTTTTCCftA CTACCACATG a40 

40 G2^CTXTIT CAXTGAACCC GTGAAGAAGC ATCCACTGGT TGAGGGAGGG TAOCACCOGC 900 

ACATCGfTTTA aUSGAGGCAG AAAGTTCCAG AAACCAAGGA GCCAACCTGT GGATTAAAGG ^€0 

ACAGTGTTAA CATCTCXX!AG AAGGAAGAGC TATGGCGGGA GAAGTGGGAG AGGCACAACT 1020 

TGCCAAGCAG AAGCCTCTCT CGGCGTTCCA TCAGCA3U36A GAQATGGGTO GAGACACIOO lOBO 

TGGTGGCGGA CACAATWSATG ATTGAATACC ATBQGAGTGA GAATGTGGAG TCCTACATCG 1140 

45 TCACCATCAT GAACATOGTC ACTGGGrTGT TCCAIAACCC AJkOCATTGGC AATGCMVTTC X200 

ACKSSt&XTGT GGTTCGGCTC ATTCI»CTCQ AAQAAGAAGA GGAM3GAC1X3 AAAATAGTOrC 1260 

ACCATGCAGA AAAGACftCTS TCTAGCTTCT GCAAGTGGCA QAAGnOXATC AATCCCAAGA 1320 

GTGACCTCAA TCCTGTTCAT CRDORCgrGG CTGTCCTTCr CSVCCfiGARAG 6ACATCTGTG 1380 

CTGQTTTCAA TC3GCCCCTGC GAGACXCTGG GCCTGTCTCA CCTTTCBIGC5A ATGTGTCAGC 1440 

50 CICAOOGCAG TTSTAACATC AAXOAAGATT CGG{3ACTCX^ TCTGGCTTTC ACAATTGCCX: 1500 

AamOCTftGS ACACftGCTTC GQCaUTCCAGC ATGATGGGAA AGAAAATGA£! lX3TGaGCCTG 1560 

TGGGCAGACA TCOGTACATC KCGXCOQaCC AQCICCAGTA CSOATCCCACT CCGCTGACAT 1620 

GSTCCAAGTO CaS<3C3RGaaG TACATCACCC GCTTCTTGGA COGftSSCrGG GGGTTCTGTC 1600 

TTGATOACAT ACCTAAAAAG AAAGGCTTGA AGTCCAAiaaT CATTQCCXZCC QGAGTGATCT 1740 

55 ATGftXGTXCA GCACCa^GTGC! CAOCTACSWI ATGQACGCAA TGCTACCTTC TGCCAGGAAG 1800 

TACSAAAAOGT CTTQCCAGACA CIGTGGTGCI COSTGAAGGG CTTTTQTOGC TCTAAGCXGG 1S6Q 

ACSCTGCTGC AGATOQAACT CAATOTTOTG AGftAGAAGTG GTGIAXGGCA OGCAAGTGCA 1920 

TCACAC7TGGG QAAGAAACCA GAGAGCATTC CTGGAGGCTG GGQCCQCTGG TCACCCTGGT 1980 

COCACTGTTC CAGGACCTGT GOGacrGGAG TOCAGAGOGC AGAGAGGCIC TGCAACAACC 2040 

60 C!a3nGCC!AAA GTTTGQAGGG AAATATTGCA CIGGAGAAAG AAAAlGGCZAT aSCITGTOCA 2100 

AjCGTCX:ACOC CIGTOGGTCA <33U3GCACCA& GAXTTCiaaCA. OATGGAGTGC AGFGA2OTTS 21fi0 

ACACXGTTCC CTAGAAOAAT GAACTCTACC ACTGGTTTOC CATTXTTAAC OCAGCACATC 2220 

CTTGTGAGCT CTACTGOOGA CCCATAGATG GCQkGTTTTC TGAGAAAATG CTGG^GCTG 2280 

TCATTGATGG TACCCXJTTGC TTTOAAQQCaQ GCAACAGCAG AAATSTCTGT ATTAATGGCA 2340 

05 TATGTAAGAT GOTTGGCTGT GACIATGAGA TCGAXTCCAA TGCCMXXaAO GATOOCTGOO 2400 

GTGTGIGCCT GGGAGAXGGC TCTTCCIGCC AOACTGTGAG AAAGKXGTTT AAGCAGAAGG 2460 

AAGGATCTOG TTATGTTGAC ATTGGGCTCA TEOCAAAAGG AiOCAafiGaaC AlAASAOTGA 2520 

TGGAAATTGA GGGAGCTQGA AA<:^TCX:TaG CCATCAGGAG TGAAGATOCT 6AAAAATATT 2580 

AOCTGAATGO AGaOTTTATr ATCCAGTGGA AQOGGAACTA TAAGCT^GCSk GGGACTGTCT 2640 

70 TXCAGTATGA CAQGAAAIX3A GACCTOQAAA AGCTGATGGC CACAGGTGOC ACX^AATGAGT 2700 

CXOTGTOGAT CCAGCTTCXA TTCCftGGTGA CTAAOCCTGG CATCMOTAT QASIACAGAA 2760 

TCCAGAAAGA TGGOCTTGAC AATOATBTTG AGGAGASGTA CCTCTGGCAB TAGGGCCACT 2830 

GGACAGAGTG CAGTGTGAOC TGGGGGACAO' GTATCCGCCG CCAAACTGCC CATTGCATAA 2B80 

AGAAGOGCGG aSGGATGOTQ AAAGCTACA^ ItTTGTGACCC AGAAAGACAG OCCAA^TGGGA 2940 

75 GACAGAAOAA GTGCCATGAA AAGGCTTGTC CAOOCAOGTO OTGOGCAGOG GAOTGGGAAG 3000 

CATGCTOSGC GACAXQCGQG OCGSCACGGGG AQAAGAAGOG AACCaSiecIG TGCftlCCAGA 3060 

CCATGGTCTC TQACXSAGGAG GCTCTCCC3QC CCACAGACTG CCAGC3U:CTG CTQAAGCCCA 3120 

AGACCCTOCT TTCCTGCAAC AGAGAOVTCC TGTQCCCCTC GGACTGGACA GTGGGCRACT 3180 

^ GGAGTGAGTG ITCXGTTTCC TGTGGTGGTG GRGTGCOGAT TCGOUairGTC ACATGTCCCA 3240 

oO AGAACCATGA TGAACCTTGC GATOTGACAA GQAAACCCAA CAGGOGAGCT CTGTGTGGCC 3300 

TCCIU3CA«ra CCCrrCSAGC CGGAGASTTC T6AAAOC3UA OULAGGCACT ATTXCCAAatG 3360 

6AAAAAACCC ACCAACACTA AA1GCXX3GICC CTOCSlOCTAC ATOC3U5BCCC AOAATGCTGA 3420 

CCACACCCAC AOGGCCTGAG TCTAXGABCA CAAOCACCCC AGCAA^TCAGC AGCOCTAGXC 3480 

CTACCACAOC CTCCAAASAA GGAGACCTGO GIGGGAAACA GTGGCaU^T AGCIGAACCC 3S40 



AACCTGAGCT 
CTTOCCAATC 

GCAACXCTAT 
T«3GAGATrCA 

atcx:tgtaat 

AAATGCCAC7 
TCAGCACW3T 

GAGACCACCA 
ACAACATGAA 
TGATTACT6A 
GCrCTGCACA. 
CCTACTGGAA. 
GACCTGACCC 
GGAGCAAGTG 
ACftGCCGGGA 
OOCC*TTQAB 
CSOCAGTGCTC 
GCCTCTGTGA 
GTCACTGOGC 
AGAGGATTQT 

AAAGTGCCGA 
AAGCCATGAA 
AGACACACAT 
AACTCTAAOC 



GAGCTCTCGC 
CTTOAGCATT 
GOGAOGCCTT 
CRCTTOaCCT 
CAGTGGCTCA 
ATGGACCAAQ 
TGCACCTCCA 
AATGSAAGGA 
TGAGGGGATG 
GCCAOAACCC 
CCAAACAAAA 
GGGCTTTTTG 
CTGGATCGTC 
AAGGBTGGAG 
TGCAAAAAGA 
CTCCAQAAAC 
CCACCGGAAC 
CATQRQCTGI 
CAGGTCCTGT 
TTEGACAAAA 
CACTGGGAAC 
CCAATGTGTG 
CAAACCCAGA 
TTTACTTTGC 
GAAATGTTCT 
CACACACAOC 
CCAAA 



TATCTCATTT 
CAGCCAAGTG 
GTAGCTACAA 
GTQACTOCAT 
GGGGAAGAAA 
ATCAQAOTAC 
CTAACACCAG 
CTGCTCCCCA 
GTTACTGAAA 
TCAGOAAAGA 
AGTTCTGAAC 
CTAAATQCCr 
GGAAACIGGA 
TGCACCAdCCC 
TGCCACCTCC 
TGCAGTGGGG 
CTGAGGCCAT 
AACCCGGAGG 
OGAGGTGGAG 
ASaACCCACAT 
TGGGACXrrOT 
CCCXCAGAGG 
CCTCCAGAAT 
ACTAAGGACA 
QTQCGCSbCCB 
CAAAGGCAAA 



CCACTGGAAG 
AGGAAAATGT 
CAACAAGTGG 
TTTACAATAC 
GAGAACAGOC 
CTGGAAATGA 
ATCTCAGCAG 
GCC2UIASGCC 
AGCCAGCCAA 
CGGCAAACOG 
CAGTCCTOAC 
CCAATTACAA 
GCX5AGTGCTC 
AGATCGATTC 
GTCCCTGTGC 
GCTTCAAGAT 
TTCACTGCCA 
CCTGTGfiOGC 



CCACC3VTOTC 
GTTCCACTTC 
GCAATAAAAC 
TCAAAAAATQ 
AACTGTCAGC 



GAAGGCAAOG 



CACTTCX:CAG 
TTCC3VGTTCA 
TTCTGGCTTG 
CTTGACCT^AA 
TGAGGACAAA 
CGCT0OGT3 
GGAGTCCTGG 
CACTACTTCC 
CACTCTGCZTC 
TAACCAOCTG 
TGAGOAGQAT 
GCftGCXCACA 
CACCACSVTGT 
XGACTGTGOG 
tqgctqgaaa 

ACGCGA6ATT 
GTTCCTGGOC 
GTGGCZU9GTG 
AQGAOTarrG 
TTGCAATGAG 
CTOTGQAQQT 
TGAAGA0CA2^ 
CAACCAGCAG 
CAGTTTCTGC 
GTGCTGCTTC 
GTTOCTCCAA 



CCCATCCTCA 
GATACTGGTC 
TC3VrCTTCCC 
GGTCCAGAAA 
GATGAAAGCA 
QAAAGTACAG 

tggc3cacxx:t 

GAAACnSGOA 

cctctgggag 
aaacttccaa 

eCRAjCAAfiTC 
AACGGCCAGG 



GCCATOCAGA 
GTGG6AAACT 
CAGTGOGTGG 
GGCATTCCTC 
6AGCCTTGGA 
TGTCCMSGAG 
CACCTGTGCT 
GdCTTTCAGA 
GACCRA.TGTC 
GCCIGCAAGA 
CAGACACTGA 
TOGTGTOCGC 
AAGTCAAAAG 



3600
3660
3720
3780
3840
3900
3960
4020
4080
4140
4200
4260
4320

4380

4440
4500
4560
4620
4680
4740
4800
4860
4920
4980
5040
5100
5115



Seq ID NO: C155 DNA Sequence

Nucleic Acid Accession #: NM_001062.1

Coding sequence: 76..1300



GCICTCATTA 
TACACTGTTG 
TCTTTTATTC 
CTAAAACCTC 
AATGTTQTGT 
ATOCAACAAA 
GCCTTQATTA 
TAOCAOCTGA 



CTGTTCAATG 
AACTATTATT 
AOCI6TGTGA 
AACAOXJUaTA 
GGTCICATTG 
GACSA3TATA 
ATTTCTCAAS 
GGAAAGAOCr 

GTCAATTACT 
TCTGTCITCC 
ACAATGGAGG 
AATAATGACA 
GGTA6TTACG 
GCOCAAACirf 
TTATGOCTTC 
TTCTCIACATG 



11 
I 

CCTTCTGCCC 
GAGAGATGAG 
CAAGCCAACT 
TGTTGAATAC 
TGTCCCTCAA 
TCAAATACAA 
TACTGGCnr 
CTGACAAGCr 
CTCCCCVGAC 
GGAACTACTC 
TTGGTAQOCA 
AGAAGAGTCT 
TTTATACnAA 



ATGAAAA1GA 
GAGCATTCAG 
TCITGGATAT 
CTQATGAGOC 
CIGTGAGAAT 
TCAlSTGTSAT 
AGOGCTCATG 
GAACCTACOX^ 
TTGTCCGCAA 
17CCTCAGCT& 
TTCTTCATTT 
TEC7VATAAAA 



21 
I 

ATCACTTAAT 
ACAGTCACAC 
ATGCGAGATT 
AATGATCXSU 
ACITGTTGGA 
TGTGAAAAGC! 
GQGAGTATGT 
AGAAAAXAAA 
XAACZAGTAC 
AACCGCDfflUL 
GTTCTCAGIA 
AATAAATGGG 
GTCACTGSTA 
TAGCnCAGQA 
CTGGAATIQC 
TAATCCAAAC 
TAACAAA6AC 
TATAACTGIG 
GAATGAAACA 
GGAGAAAGCC 
6GGGCCCTAT 
GGAACTTCTG 
TGGAGAAAAC 
CATAAftASCC 
ATCCCAOXAC 
GTTGTlXaWA 



31 
I 

AAATAGCCAG 
CAQCTQCCCC 
TGTGAGGTAA 
TCAAACTATA 
ATCCAGATCC 
AGATTGTCAG 
CGTAACGCTG 
TTCCAAGCAG 



CTTGTCA7U2C 
GATACTGGIG 
CAGATCAAAG 
GAAAAGATTC 
GAAGCCATOC 
CAACAAACTC 
GCTGCMGCCC 
TCTTCTTGOG 
ACACCrCCTG 
TATTTCAC3CA 
ChGAAAA!CGA 
ATCACCTGTA 
AGTOGAGGOG 
TTGGAGGTCTC 
ATTTSC!AGT6 
QAGCTGQAQA 
GATTAAC 



41' 

I 

CCAATTCATC 
TAGTGGGGCr 
GTGAAGAAAA 
AQU3GGGAAG 
AAACCCTGAT 
ATQTAAGCTC 
AGGAAAACTT 
JUUklTGAAAA 
TGGAiOCTTTT 
ACTTCACXCC 
CAATGGCTGT 

(;:AGATGAAaa 

TGTCIGAGAA 



TGAATAlCivai' 
AGGTCTTACC 
TCTCTGCTTC 
ACTCACAATC 

ATGATACTAT 

ttcaqgqcct 

AACCACTGAG 
GCTGOAGCAA 
GftBT TCCATG 
QTTAATAACC 



51. 
I 

AACATTCTGG 
CjtTACTGTTT 
CTACATCCGC 
CAGOSCTGTC 
GCAAAAGATG 
GGGAGA6CTT 
ARTATATGAT 
TATGGAAGCA 
GGCCTTOTOT 
TGAAAATA2UL 
GCTGGCTCTG 
CAGTTTAA7U3 
AAAAGAAAAT 
TGTAtrGKCCA 
GCTCAGBGAA 
TGOCCTGATG 
AGGTAACXTC 
ATATATCTCC 
0CTAAIKEGGT 
ASTTGGTTTC 
ATOTOCCAAC 
CCAAGGAGCT 
ATACTAATAA 
TTTATTGTCC 



Seq ID NO: C156 DNA Sequence
Nucleic Acid Accession #: NM_00d591
Coding sequence: 59..349



1 11 21 

I I I 

CACTCCCAAA GAACTGGGTA CTCAAlQlCTG 
(oTGCTGTACC 
OGCaCGAATCA 
rrCATCCTAAA 
TGCTAITCATC 
GGTGAAAIAT 
TTTCTGGAAT 
TTCACTTGCA 
CAATTAATGA 
GTTAAACXGI 
TAATTTTCX3i 
AAOATTATAT 
CrTTTGTTTA 
CATTAATAAG 



AAGAfiTTTGC 
GAAGCAGCAA 
TTTATTSTQG 
TTTCACACAA 
ATTQTGOGTC 
GGAATTGGAC 
CA37CATGGAG 
AGTTQATTCA 
ATITTATGTT 
TAAGCTAXTT 
GQACTTTCTT 
lATTGTTTTQ 
ACAAATATT 



GCAACTTTGA 
GCTTCACADa 
AGAAAAAGTT 
TCCrCAG-tAA 
ATAGCOCAJVG 
GGTTTAGTGC 
TATTGCATCA 
ATTTATAGCT 
TGGTTTAQTQ 
GCAAQCAACA 
TCTCCTAAAT 



31 
I 

AOCAGATCTG 
TTTGATGTCA 
CTGCXGTCTT 
GCAGCTGGCC 
GTCTGTGTGC 
AAAAGXCAAG 
AACAGAAAGA 
TTATCTAATT 
TAOTTTGCTT 
GTAGGTTTTC 
CAAASIATAA 
ASCTATTi'TT 
TGTTGEAATT 



41 

I 

ttctttgagc 

GTGCTGCTAC 
GGATTVCS^CAG 
AATGAAGGCT 
GCAAATO^Ul 
AACATGliAAA 
ACCTTOCTOS 
TGTGCCTCAC 
TGTTTAAGCA 
TGTGTTTAGC 
AATTAXATTT 
TAAAAAAACT 
GCATXATAAA 



51 
I 

TAAAAAC3C3Wr 
TOCACCTCI6 
ACOGTATTCT 
6TGACATCAA 
AACABACTTG 
AACIGTGGCT 
GGTTGGAGQT 
TGGACTXGTC 
TCACATTAZm 
TATTTAATAC 



Aim'AACATT 
ATAAGAAAAA 



60 

120

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

1020 

1080

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1537 



60 

120

180 

240 

300 

360 

420 

480 

540 

600

660 

720 

780 

799 



Seq ID NO: C157 DNA Sequence

Nucleic Acid Accession #: NM_013271.1






Coding sequence: 27..809






TOC3QGAGCCA 
CC<3<3GGGCOT 
TCTGCQCGCG 

IGCAXSGJiGCr 
GGGCOTWSGC 
TCTGCSOeCGC 
CTGCAGOGCA 
CCCAGCTTGT 
AGGfkC36GCCC 
CCGAGCTGTT 
TGGCMCCOC 
CTGAGOGCGT 
TGCCTGOVOG 
CAGAAGTGCC 
TTACOCOGGC 

(aATClQAtaC 



11 
I 

GGCTCGCTGG 
CGGCCTTTTG 
GCCX3GTAAAG 
TOCTOGOOGC 

GCAGGAGGCT 
CCCCCGCAAC 
GCTCGCTCGC 

cccx:acGCcc 

OGOGGGCCCG 
QAGGTACITB 
GCGCCGOCTC 
GCTOGGGGCO 
COBCCTCTTG 
COCGCCATCC 
CAGCCAGCCC 



21 
1 

GGCAGCRTGG 
GTGCTGCTGC 
GAACCCCGCG 
TTCOOGCGGT 
CTGGCGCATC 
GAGGATCAGC 
TCrGATCOGG 
GCTCTGCTCC 



GATGCTGAGG 
CTOGGAOGGA 
OGC0GTGCCX3 
CTGCTGCGTG 
CCi^CCCTGAG 
OGCCACCAGG 
TCrCAOCOGA 



31 
I 

OOGGGTCGCC 
TGCTCOOCCT 
GCCTMGOGC 
CAGTGCCCOG 
TGCIGGAOGC 
AGGCGOGOGT 
CTCTQGGCcr 
GC3GCC0GCCT 
CGGCGCTCCQ 
AGGCAGSGQA 
TTCT-Ct5C3C5GG 
CCGACCACGA 
TGAAACGGCT 
CACTGCCOGG 
AjCtTCTCCCC 
OGATCOCn^C 



41 
I 

GCTGCiCTGG 
GTTTCGQCCQ 
AGGGTCTCOG 
AGOTGAGGCQ 
GOAACXrrCAG 
CCTGGOGCAG 
GaAa3&CGA.C 
TGAOOCTGCC 

OGAGAChCCC 

AGAOIVCCCCG 
ATOOOGIGCA 
GCCAQCACGX 
OCOCIGGCCC 




GAGGGGGDSC 
CTGCTGOGOG 
CCOGACGCGC 
GCCCTAGCAG 
CG(aGTCTA£a» 
6AC6TGGM3C 
TCGGAGGGGG 
GAGCTGCCCC 
GCGOOCCAGG 
CCCTGGGIkCC 
CCAGAGChAC 
ACAATAACAT 



Seq ID NO: C158 DNA Sequence
Nucleic Acid Accession #: NM_002245.
Coding sequence: 183..1193



1 
1 

OOGCA5GAAG 
GOGGGOGGGA 
OGCGCTCCQG 
AGATGCTGCA 
CCCQGVGCTT 
TCTTCTCCTC 
AGOGAOBCTT 
GCOGGGTGCr 
GCSAACrOGGA 



TTOGCATTCC 
TCACCCGCAG 
CCATCQTCCA. 
OCGCTGTCTT 

AATTCAGAGA 

TGTTGGTAGT 
TCTATGIJGAA 
CdXCTCCTC 
CITTTGTGGC 

aittqttqca. 

CATTTTTATC 
AATGTCTTAT 
AGGAT6TCTA 
TCfCGACXTT 
TTTTATACTT 
ATAGGAAAAT 
QTTTATGTGT 
CAftACTCnCT 
TGTTTATATT 
ACIAThSATA 
ArrAA2U^TAAa 



11 
I 

ACGOCGCTGC 
GCCAGGCCCG 

GTCCCTGGCC 



GGlX3GhGCTG 
CTTGGAGGAG 
GGaGGCX:AGC 
CTTCWCTCC 
GGTSCOCTTG 
CTTCACCCTC 
GOO£5GTCCTC 
TOOOTTGCTC 
CTCAGXC3CXG 
GAOCAOCATT 
GCICXATAAG 
TCTGGAAACC 
GAAQGZVCAAG 
GA'XOVCAGAC 
CAaCSCAGTCSL 
TTATOCrAGA 
AG&ATGGhAA 
TAAAAAACAA. 
ATATGTGAGG 
ACAKAGOAGG 
TTAACT6GAA 
TTATATTOrAO 
ACTQOTTTGC 
AXTTAIAATQ 
CTGTACATAT 



□GGAATAATA 



21 
1 

OOGGAGGAGC 
GGCGGGGGCG 
GCGTTGGCCT 
GGCAGCTGGT 
GTGCTGeGCT 
CCCTAOTGAGG 
CAOGAGTGCC 
AACIACGGCG 
GCQCTCITCT 
TCAGATGGAG 
CrOTTCCTGA 
TACTTCCACA 
CTTGGGTTTG 
GiUSGATGACT 

ATTG0GAVCA 
TTCTGTGAAC 
GACQAGGATC 
CAGGCAGCTG 
TCTGQCIGGG 
GCACCAGGGT 
AGGGAAAATT 
CAAAAAAAGA 
AAATGAGAOXS 
AGAATACETQ 
ACTTTGOGGT 
AAGCAAAAAA 
ATGTACCCAC 
CATAGGTAAC 
G6TTTAGGTC 

AAC^QAGAG 



31 
I 

QGGGOGGGOG 
GG0GCGG0C3G 
TGGCTTTGGC 
GCGIGOGOCr 
ACTTGCTCTA 
AGCIGCTGCXS 
TGTCTGAGCA 
TGTCGGTGCT 
TCGCCAGCAC 
GXAAGGCCrr 
OQGCTGXGGr 
TGCGCTGGGG 
TCACTOTGTC 
GGAACTTCCT 
ATTATGTGCC 
CGIGTTACSCI 
TOCATGAGCT 
AGGTGCACAT 
GCATGAAAGA 
TGGATSGCCC 
CAGGGTGCftA 
ATGTC3tf!TTT 
CACATGGRAC 
TCGACCIAAA 
AAGCAGTAXO 
TTGCATTTAG 
AAAAAGCATA 
OCAAAATGAT 
CAITAACTAT 



C^TATAltaTA 
T6AA7AAGCA 



41 
I 

GGOGOGOGGG 
GQCCAGAAGA 
TTTOGCGQCG 
GGTGGAGGGG 
CCTGGTCTTC 
CCAGGAGCTG 
GCAGCTGGAG 
CAGCAACGCC 
GGTGCTCTOC 
CnnSCATCATC 
CCnSOGCATC 

crrciccAAG 

CTGGTTCXTC 
GGAAQ?CCTT:f 
TOGGGAAGGC 
GCTACTTGGC 
GAAAAAATTC 
CATAGAGCAT 
GGAOCAGAAG 
TGCAAACCAT 
GG3UUSAGGCT 
AAGAAATAGC 
AAAGAAGCTG 
JVFTCATATGT 
CTG CTglG Gr 
AICATTTAGC 
GAGATGTGXT 
TATTTTTGGA 
OTACATATAA 
AG1X3TAGTTC 

AAGAATCCAG AGTTGCTACA 



51 
I 

GGAG06GGCG 
GGOGG0GG6C 
GOGGTGQAGA 
CAiCOGClOGG 
GGOGCAGTGG 
OSCAAGCTGA 
CRGTTCCTGG 
TCGGQCRACT 
AGCACAGGTT 
TAGICCBXCA 
AC00TGCAC9G 
CAGGTGGTG6 
TTCATCCCGC3 
TAXTTTTGl'X 
TACAATCAAA 
CrTAITGGCA 
AGAAAAATGT 
GACCAACTGT 
CAAAAT6K3C 
•CGAGCIGKCAGG 
TAA GTATOT T 
TACTGTTTGC 
TGAOCOCTGC 
GACAAAATTA 



TGATGGCIAA 
TTATAAATAG 
GAATCTAAGT 
AQTATAAATA 



Seq ID NO: C159 DNA Sequence
Nucleic Acid Accession #: NM_005472.
Coding sequence: 93..404



60

120 

180 

240

300 

360 

420 

480 

540 

600 

660 

720 

780 

840

900 

960 

9fi9 



1 
I 

AAAGGGACTC 
QCQAOTCTTC 
GGTATGRGAG 
TCTGCOOGCC 
TAOCTGGCOS 
TAACTGTGG6 
AGOSCXATCA 
OQOrGGAAGA 
TGCATCAGGZ 



I 

CIIGAAACIG 
OCCCAOCTCA 
CCTGCATX3QC 
AGGGCCAGGG 
IGATGACAAC 
CM30CTCATC 
TGIGTATATC 
OOU M 3ACAOC 

cr 



21 
1 

AITGAGAGOC 
ATOCCTGTTO 
GTGCTGAAGG 
CTGGGGCX3VG 
TCCTACA!i:GT 
CTOGGATACA 
AAGAAdCCGTG 
TGGQGATTGC 



31 
I 

CAGTGGAVTT 
CTATGGAGAC 
CrCTAAATC3C 
ACAAOCAGAC 
AGATTCICTT 
CCCGCTOCOG 
XGXClATGiAX 
GTCXGGGGCX; 



41 



51 
I 



TACCAATGGA 
CACTCTTCAC 
TGAAGAGAGG 
TGTC31TGTTT 
CAAAGTGGAG 



TCCAGAACTC 



Seq ID NO: C160 DNA Sequence

Nucleic Acid Accession #: NM_005345.1

Coding sequence: 187..13959

1 11 21 31 ' 41 

I 1 I I I 

CrGGaCQGCC OGGOGGGGGS AGAGGGOGCG GGAGGOGCTrC 



AGCAAOTTGC 
OGGGCCAGCC 
CTATTTGCTG 
AAGCGTA6TG 
AGdGCTOGGA 
TGCIXSTGSAC 



51 
I 

XACCATQCGG 



60 

120 

180

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380

1440 

1500 

1560 

1620 

1680 

1740 

1800

1860 
1901 



60 

120 

180 

240 

300 

360 

420 

480 

492 
ACX30G0GAGC CCGGCGAGGC CXXI3GCAGGC CC3GTCCCTaC TCGGGGGCGC GCTGWaACQS 120

CGGOTOAGCT CC^CGAt^AGC QCOTTCGCCA CTTOGGGCCA ACTTTGOBAT TCC06ACA6T 18D 

TAAGCAATGG <3Q3«3ACATTT GGCTTTGCTC CTGCTTCTGC TCCTTCTCTT CXMCATTTT 240 

QGAGACaGTG ATGGCA<3CCA ACORCTTGAA CRGACTCCTC TQCAOTTrAC ACACCTOGAG 300 

5 TACAACGTCA CCQTaCAGGA GftACTCTGCR GCTAAGACTT ATGTGOGGCA. TCCTGTCAA6 360 

ATOSQTQTTT ACATTACACA TCCAGCGTGG GAAGTAAQOT ACfkAAATXGT TTOCGGAiQAC 420 

AGTGRAAACC TGTTCAAAC3C TGAAGAQTAC ATTCITOGGAC ACTTTTGCTT TCTAAlSAATA 4B0 

AGGACX3^AAQ QAGGAAATAC AGCTATTCTT AATAGAGAAG TGAJVOQATCA CTACACATTG 540 

ATAGTGAAAG CACTTGAAAA AAATACTAAT QTGGAtaGCGC GAACAAAGGT CAGGGTGCAO SCO 

10 GTGCTGGATA CAAATCxACTT GAGACCSTTA TTCTCACCCA CCTC3VTACAG CGTTTCTTTA S60 

CCTGAAAACA CAGCTATAAG 6ACCAGTATC GCAAfiRGTCA GOGOOU^GGA T6CAGACATA 720 

GGAACCAACG GG3AATTTXA CTACAGITXTT AAAGATCGAA CAOATATGTT TGCTATTCAC 7B0 

CCSiACCAlGTG GTGTGATAGT GTTAACTGGT AQACTTGATT ACCIAGAGAC CAAGCTCTAT B4 0 

6AGATGGAAA TCCTOaCTGC GGACOSTGGC ATGAAGTTGT ATGGGAQCAG TOGCATCAGC 900 

15 AOCAIGGCCA AGCTAACGGT GCACATOOftA CAQOCCaUVrG AAT6TGCTOC GGTGATAACA 960 

GCAGTGACAT TGTCa^CCATC AGAACTGGAC A6GGACCCAO CATATGCAAT tGTGACAGTG 1020 

GA1GACTG0G ATCAGGGTGC CAATtSOTGAC ATAGCATCTT TAAGCATOGT GGCAOOTGAC 1080 

CTTCTCCaaC ASTTTAGAAC AGTGAGGTOC TTTCCAGGGA OTAAOGAOTA TAAAGTC3VAA 1140 

GCCATGG6TG ACATTGATTQ QQACAQTCAT CCTTTCGGCT AC&ATCTCAC ACTACAGQCT 1200 

20 AAAGATAAAG QAACTCCGCC CCAGITTCTCT TCTGTTAAAG TCATTCAOaT GftCTTCTOCa. 1260 

CAGXTGAAAS COGGGOCAGT CAAGTTTGAA AAQOATGrXT ACAGAGGAGA AATAAGTGAA 1320 

TTTGCTCCTC CCAACACACC TGXGGTCATG GTAAAGGCCA TTCCTaCTTA TTCCCATTTG 1380 

AGGTATGTTT TTAAAAGQAC ACCTGOAAAA GCTAAATTCA GTTTAAATTA CRACACTGGT 1440 

CTCATTTCTA TTTTAGAAOC AGTTAAAAGA CAGC3«3GC3«3 CXX3VTTTTGA ACTTGAAGTA 1500 

25 ACT^ACAAGTO ACAiQAAAAGC GTCCACCAAG GTCTTGGTGA AAGTCTTAGQ TGCAAATAaC 1560 

AATCCCCCK3 AATTTACCCA GACAGCGTAC AAAQOTGCTT TTGAIGAfiAA OGT6C0CATT 162G 

GGTACTACTA TCRTGAGCCT GAGTGCCGTA GACCCTGATG AGGGTGRGRA TGGGTACGTG 1680 

ACATACAGTA TOGCRAATTT AAATCATQTG CCGTTTGCGA TTGAOCATTT CACTQGTGCC 1740 

OTOAiOTACGT CAGAAAACCX GGACTACGAA CTGATGCXTC GGOTTTATAC TCTGAGGATT 1800 

30 CGTGCATCAO AC3PGGGGC?CT GCCQTAOCXSC CGQ6AAGTC6 AAGTOCTTCC TACBATTACT 1860 

CTCAATAACX TGAATGACAA CACACCTTTG TTTGAGAAAA TAAATTGiTGA AGGSACRATT 1930 

OCCAGAGATC TAGOGGTOOG AGAGCAAATA ACCACTGTTT CTGCTATT0A TGCASATQAA 19 BO 

CTTCAOTTGG TACAGTAXCA GATTGAAGCT GGAAATGAAC TOGATTTGrT TAGTTTAAAC 2040 

CCCAACTOGG GGGTATTGTC ATTAAAGCX3A XCGCXAATGG ATGGCTTAGG TGCAAAGGTG 2100 

35 TCTTTCCACA GTCIGAGAAT CACAGCTACA GATGOAQAAA ATTTTGCX2AC ACCATTATAT 2160 

ATCAACATAA CAGTGGCTGC CSWSTCaUIAAG CIGGTAAACT TGCAGTQTQA AGAGACTQQT 2220 

GTTOCCAAAA TGCIGGCAGA GAAGCTCCTG CAGGCAAAXA AATOACSAiCAA CCM9GGAGAG 2280 

GTGGAGGATA TTTTCTTCSaA TTCTCACTCT GTCAATGCTC ACATACOGCA GTTTAGAAaC 2340 

ACTCTTCX3GA CTGGTATTCA GGTAAAGGAA AACCAGCCTQ TGGSTTCCSyS TGTAATTTTC 2400 

40 ATGAACrCCA CTGACCTTTGA CAH^TOGCITC AA1GGAAAAC TGCTCTATGC TGZTTC^aGA 2460 

GGAAATGAGG ATAGTTGGTT CSKEGATTOAT ATGGAAACAG GAATGCa^AA AAITTTATCT 2530 

OCrCTTGACC GTGAAACSkAC AGAC3AATAC AIXXTTGAATA TTACOGTCTA TQACCTTGGG 25B0 

ATACCCCAGA AQGCTOOGTG 606TCITCTA CATGTCQTSQ TTGTC23ATGC CAATGATAAT 2640 

CCACCXaJAQT TTTTACAGGA GABCXATTIT GTGGAAGTGA GTGAAGACAA GGBGQTACAT 2700 

45 AaTOAAATCA TCC2AG61:TGA AGGCACAGAT AAAGACCTGG GGCCCAACOQ ACACGorGAGS 2760 

TACTCAATTC TTAGAGACAC AQACACATTT TCAATTGACSL G0GTGACGG6 TGTTGTTAAC 2820 

ATCGCACGCX? CXCrGGAXGG AGASCIGCAG CATGAGCACT CCTTAAaiBAT TGAOGGCAGG 2BB0 

GACCAAGOCA GAGAAjaAGCC TCAGCTOTTC "XCCKCCGTCG TTGTGAAAGT ATCACTAQAA 2940 
GATtarlAATG ACAACCCACC TACATTTATT OCAOCTAATT ATCQTQTGAA AGTCX^GAGAG 3000 
50 GATCTTCCAG AAGOAACCSGr CATCATQTOG TTAGAAGOCX: AGGATCCTGA TTTAaOTCAG 3060 
TCIGGTCAGG T6ACATACAG GCTTCTGGAC CACGOAGAAS OAAACITGGA TOTGGATAAA 3120 
CTCAGTOGAG CAGTTAGGAT CGTGCAGCAG TTGGACTTTG AGAA0AAGCA AGTOTATAAT 3180 
CTCRCTGTGA GGGCCAAAOA CAAGGGAAAG CCAOTTTCTC rMTCTPCTAC TTGCTATGTT 3240 
GAAGTTOAGG IGGTOTGATGT GAATGAGAAC CTGCACGCAC OOOTOTTTTC CAGCTIIGIG 3300 
55 GAAAAGGGGA CA£3TOAAAQA AOATaCACCT CSXTGGTTCAV TGGTAATGAC OGTGTOGGCT 3360 
CAtTGATOAOG AOGCCGGAAG AGATGGGGAG ATCGGATACT CC31TTA0AGA TOSCTCTGGC 3420 
GTTGGTGTTT 7CAAAATAGG TGAAQAOACA CQTCTCATAG ACACGTCACA TOSACTGGAC 3480 
CQTGAATCGA OCTCCGATTA TTGGCTAACA GTCTTTGCAA CCGATCAGGO TQTOGIGCCT 3540 
^ CTTTCATOQT TCATAOAGAT CTACATAGAG GTTGAGGATG TGAATGACAA TGCACXS^CAQ 3600 
60 ACATGAGAGC CTGTTTATTA CCCAGAAATC ATGGAAAAXT €nx:C£lAAAGA TGTATCTGTG 3660 

oTccauawicsG AfiGcsunniGA vcx^gattos agctciaatg acaagctcat otacaaaatt 3720 

ACAAGTGGAA ATOCACAAQG ATTCTTTTCn ATACATCCTA AAACAiSSTCT CATCftCAACT 3780 
AOSTCSUUSGA AGCTAGACOS AGAACAGCAA GATGAACACA TATTAGAGOr TACTGXGACA 3840 
6ACAATGGTA GTCCCCCX^AA ATCAA£SC3LTT GCAAGAGICA TTGTGAAAAT CCTTGATOAA 3900 

65 AATGACAACA AACCTCAGTT TCTGCAAAAG TTCTACAAAA TCAGACrOCX: IGAGOSGGAA 3960 
AA8CCA0ACC GAQAAAQAAA TdCCAGACGG 6AGGOGCTCT ATGGOGTCAT AQCCACCOAC 4020 
AAGGAIGAOG GOCOCAATCC AGAAATCTCC TACAGCATOQ AAGAOSGGAA TGAGCATGGC 4080 
AAATTTTTCA TOGAACCBAA AACtQGAGTG GTTTG6TCCR AGAGGTTTTC AGCA^TGGA 4140 
GAATATGATA TTCTTTCAAT TAAGGCAGXT QACAATGGTC GCCCTCAAAA GTCATGRAOC 4200 

70 ACCAGACXGC ATATTGAATG GATCXCCAAG CCCAAACAGT COCIGGA60C CATTTCATTT 4260 
GAAGAATCSUr TTTTTACCIT TACTGTGATG GAAA0IGACC CC3GTTaCTC3k CATQAXTGOA 4320 
GTAATATCTG TGGAGOCTCC TGGCATACCC ClTTGGTITG ACATCACTOG TGGCAACTAC 4380 
6ACAOTCACT TCGATGTGGA CAAGGGAACT GGAACCATCA TTGTTGCCAA ACCTCTTGAT 4440 

_^ GCAGAACAGA AGTCAAACTA CAACCTCACA GTGGAGGCTA CAGATGGAAC CACCACTATC 4500 

75 CTCACTCAGS TATXCAXCAA AGTAATAGAC ACAAATGACC ATCGlCCTCA GTTTlCTACA 4560 
TCAAAGTATG AAGTTOTTAT TCCTGAAGAT ACAGGGGCAG AAACAGAAAT TTTGCAAATC 4620 
AGTGCTGTGG ATCAGGATGA GAAAAACAAA CTAA1CTACA CICIGCAGAG CAGTAGAGAT 4680 
GCACPGAGTC TCAAQAAATT TaaTCTTGAT CCTGCAACOG GCTCTCXCTA TACTTCTOAG 4740 
AAACTGGATC ATGAAGCTQT TTCACCAGCA CACOTCACGG l'CA3^GGTAGG AGATCAAGAT 4800 

80 GTGCCTGIAA AACXSCAACTT TGCAa^GGATT GlTGGTCAATiG TCAGCXSACAC GAATQACQ^C 4860 
GOCGOGTOGT TCAOOQCTTC CICCIACAAA GGGCGGGTTT ATGAATGGGC AGCOGTTGGC 4930 
iCMHTQTGrt TGCAGSXGAC GGCTCIGGAC AAGGACAAAQ GQAAAA ATOC TGAAGXeClG 4980 
TACTOGAlCa AGTCTG(&\AA TATTGGAAAT A^GGAAATT CTTTTATGAT T6ATCCT0TC 5040 
■raGGGCTCIA TTAAAACIGC CAAAGAATTA GATOBAAGIA AOCAAGOQQA C313^TGATTTA 5100 
ATGGTAAAA6 CTACAGATAA GGGCAGTCCA CCAATGAGTG AAATAACTTC TGTGCGTATC 5160

TTTGXCACAA TXGCTGACAA CGCCTCTCCG AAQTTTACAT CAAAAGAATA TTCTCSTTGAA 5220 

CTTAOTOAAA. CTGTOiGCAT TGCGACSTTTC GTTGGGAT<3C3 TTACRGCCCA TAGTCRATCA 52 BO 

TCAGTGGTGT ATGAAATAAA AGATGOAAAT ACAG6TGATG CTTTTGATAT TAATCCACRT 5340 

TClXsGAACTA TCATCACTCA GAAAGCCCTG GACTTTGAAA CTTTGCCCAT TTACACATTG 5400 

ATAATAC7AG GAACTAACAT GGCTBQTTTQ TCCACTAATA CAACGGTTCT AlCXTTCACTTG 5460 

CAGGATGRGA ATOACAACGC GCCAGTTTTT ATGC3W3QCAO AATATACAG6 ACTCATTAGT 552 O 

GAATCAGCXrr CAATTAACAQ CaTQQTCJCrA ACAGACAOGA ATGTCCCACT GGTGATTCC2A 5580 

QCAQCTQATG CTQATAAAGA CTCAAATGCT TTGCTTGTAT ATCACATTGT T6AACCATCT 5640 

GTACACACAT ATTTTGCTAT TGATTCTAGC ACTGGTGCTA TTCATACAGT ACTAAGTCTTG S700 

GACTATGAAO AAACAAGTAT TTTrCACTTT ACCGTCCAAG TGCATQACRT GOBAACCCCA 57 6 Q 

CX5TTTATTTS CTGAGTATGC AGCGAATGTA ACAGTACATG TAATTGACAT TAAOXSACTGC 5a20 

CCCCCTQTOT TTGCC3U\GCC ATTATATGAA GCATCTCrOT TGTTACCAAC ATACAAAGGA 5880 

GTAAAAGTCA TCACRGTAAA TGCTACAQAT OCTQATTCAA GIGCATTCTC AGAGTTGATT 5940 

TACTGCATCA CCG7>kAGGCAA CATCGGGGAG AAGTTTTCTA TQGACTACAA GACTGGTGCT 6000 

CTCACTGTCC AAAACACAAC TCAGTTAAGA AGCCGCTACS AGCIAACGGT TAGAGCTTCC 6050 

GATOGCaMaAT TTGCCGGCCr XACCTCTGTC ARAATTAATG TQAAAOAAAG CAAAGAAA6T 6120 

C3U:CTAAA6T TTACCCIAGGA TOTCTACTCT aCGGTAG:rGA AAGAGAATTC CACCGAGGCX: 61 BO 

GAAACATZAG CTGlTCATTAC TGCTATTGSG ASTCCAATCA ATOAGCCTTT GTrTTATCAC 6240 

ATCCICAACC CAGATCGCAG ATTTAAAATA AGCCBCACTT CRj^SGGTTCT GTCAACCACT 6300 

GQCACaCCCT TOGATCGrGA GCAGCAGGAG GCGTTTGATG TGC3TTGTAGA AGTGRTAGAG 6360 

OAACATAAGC CTTCTGCAGT GQCXXaiDQTT GTCGTGAAGG TCATTGTAGA AGACCAAAAT 6420 

OATIUVTGCQC CGGTGTTTGr CSVACCTTCCC TACTAaSCOT TTGTTAAAGT GGACACTGAG 6480 

GTGGGCCATG TCATTCX^CrTA TQTCACTGCT <>rAGACAGAG ACAGTOGCAG AAACGQGGAA 6540 

GTQCArrrflCT ACCTCAAQGA ACATCATGAA CaWTTTTCAAA TTGQACC!CTT GGGTGAAATT 6600 

TCACTGAAAA AGCAATTTOA OCTTGACACG TTAAATAAAG AATATCTTGT TACACJrGQTT 6660 

GC3AAAGATG GAGQCSAACCC GGCCTTTTCA GOQOAAfiTTA TCGTTOOGAT CACT6TCATG 6720 

AATAAAQCCA TGCCIGTGTT TXSAAAAACCT TTCTACAGTG CACaAOATTOC AGAGAGCATC 6780 

CAG6T6CACA GCXX:TaTGGT CCZACGTGCAG GCXAACAGOC COGAAGGCCT GAAAGTGTTC 6840 

TACAGCATCA CAGAOGGAQA O0CTTTC7VGC CAOTTCACTA TTAAC^TCAA TACIGGAGTT 6500 

ATCAATGTCA TAGCTCCTCT OQACTTTGAS GCCCACCCX3G CATATAAGCT GAGCATACGC 6960 

GCAACTGACI CCTTGACG66 OGCTCATQCT OAAGTATTTQ TGGA£:atCAT AGTAiGAOGAC 7020 

ATCAATGATA ACCCTCCTCST OTTTGCTCflfi CAGrCXTATG OGGTGACOCT GTCTGAGGCA 7000 

TCTGTAATTG GAACGTCTGT T6TTCAAGTT AGAGCCAOCQ ATTCTGATTC AGAACCAAAT 7140 

AGAGGftATCT CATACCAOAT GTTTGQGAAT CACAGCAAGA GTCATGATCA TTTTCATGTA 7200 

GACAGCAfiCA CIGGOCTCAT CTCACTACTC AGAACCCTGQ ATTACOAGCA 6TCCGGGCAG 7260 

CACAOQATTT TTOTGAOGQC ASTTOATGGT GGTATGCCCA OGCTGAGCAG TOATGTGA17 7320 

GTC!ACGG1X3G AC6TTAOOGA CCTCAATGGT AATCCACCAC TCTTTGAACA ACAGATTTAT 73S0 

GAAOCCRGAA TTAGCGAGCA CQCCCCTCAT GGGCATTTOG TCACCTGTGT AAAAGCCTAT 7440 

GATGCAGACA GTTCAGRCAT AGACAAGTTG CAGTATTCCA TTCTGTClGG GAArTGATCAT 7500 

AAACATTITG TCATTOACAiQ TGCAACAGGG ATTATCACCC TCTCAAACCT GCACOGOCAC 7560 

GCCCTGAAGC CATTTTAGAG TCTTAACCTG TCAGTGTCTO ATGGAGTTTT TAGAAGTTOC 7620 

AOOCAGGTTC ATGTAACTGT AATTQGROGC AATTTGCACA GTCCTQCTTT CCTTCAGAAC 7680 

GAATATGAAG TGGAACTAGC TGAAAAGGCT COCCTACATA COCTOGTOAT OGAGGTGAAA 7740 

ACTAOQGATG GGGATTCTOG TATTTATQGT OUOaTTACTT AGCAlATIGI AAA1GACTTT 7800 

OCCAAAGACA GATTTTACa^T AAATGAGACA GGACAGATAT TTACTTTGGA AAAACnTGAX 786 D 

OGAGAAAOOC CGGCGGAGAA Af7IY3ATCTCA GTCOGTTTAA rrGGCTAAGGA TGCTG6AC3GA 7920 

AAAGTTQCTT TJCTGCAfXGT GAATGTCATC CTTACAGATG ACAATQACAA TQCACCACAA 7980 

TTTaaAGCAA C3CAAATAOaA AOTOAATATC GGGTCCAGTO CTGCTAARGG GACTTCAOTC 8040 

GTAAAGTCI6 CAAGTGATGC CX3ATGAGGGC TCCPA'XGCOa ACATCA£!CTA TGCCAtTGAA 8100 

GCAGACTCTG AATUSTGTAAA AGAGAATTTG GAAATTAACA AACTGTOOGG OSTAATCRCr 8160 

ACAAAGGA6A GCCTCATTGG CTTGGAAAAT GAATTCTTC3L CCrTCTl'SGT VrAGAGCTGTG 8220 
BATAATGQCJr CTCCATCAAA AGAATCTGTT GTTCTTGTCT ATGTTAAAAT CCTTCCAiCCO 6280 
GAAATGCAGC TTOCAAAATT TTCAGAACCT TTCTATAOCT TTACAGTGTC AGAGGACGTG 8340 

CCT01^TGGAA. CAGAGArrAGA TCTCATOOGA GCAGAACATA 6TGGGACTOT TCTTTACAGC 8400 

CTOGTCAAAO OOAATACTCC AGAAAGCAAT AGGGATGAGT CCTTTGTGAT TGACAGACAG 8460 
AGCQGOAOAC TGAAGTTGGA G2^6AGTCIT GA7C?^TGAG2^ CA2^AAiGT0 OTATCAQTTT 8520 
TCCATACTGG CCAiGOTGCAC TCSUWaATOAC CATGAGATGG TGGCTTCTGT AGATGTTAST 8580 
ATOC!AA6TGA AAGATGCAAA TGACAAC3V5C COGGTCTTTG AATCIAGTCC ATATGAGSCA 8640 
TTCATTGTTG AAAACCTGCC AGGGGGAAST AGAGTAAITC ASATCAGG6C ATCTQATOCT 8700 

^u:tcagga& ccaaoggcca agttatqtat AacxrraGATc agtcrcaaag tgxqgaagtc 8760 

ATTGAATCCr TTGCCaVFTAA C3VTGGAAAC!A GGCTGGATTA CAACTTTAAA GGAACTTOAC 8820 
CATGAAAAGA OAGACAATTA CCAGATTAAA GIGGTT6CAT CAGAVOVrGG TGAAAAGATC 8880 
C!AGCTATCCrr CCACAGCCAT TGTGGATGTT ACCGTCACG3 ATGTCAACSQA TAGTOCACCA 8940 
GGATTCA09Q CO0AGA7CTA TAAAOGGACT GTGAjSTGAOG ATCAOCXSCCA AGGTGGGOTG 9000 
ATEGOCATCT TAAGTACCAC GGATGCZGAT TCTGA2)iG»GA TCAACMOAlCA AGVTACKTKr 9060 
TTCATAACAQ Q2V5GGQATCC TTTRGGACAjG TTlIQCOGTTG AAACTATACA GftATGAATGO 9120 
AAGGZATATG TGAAGAAACC TCTAGACAGG GAAAAAAGGG ACAATTAC!CT TCrTACTATC 9180 
AOQGCAACTG ATGGCACCirr CICA.TCAAAA GCGATAGTFG AAGTGAAAGT TCTGOATOCA 9240 
AA3GACAACA GTCCAOTTia TGAAAAGACT TXATATTCAG ACACTATTCX: TGAAGACGTC 9300 
CITCCIGG3A AATTGATCAT GCAQATCrCT GCTACSUSACQ CAGACATCCSS CTCTAAOdCT 9360 
GAAATTACTT ACACGTTATT GGGTTGAGGT GCAGAAAAAT TCZOU^TAAA TCCAOACACA 9420 
GGTGAACTGA AAAOSTCAAC CCCCt^TTGAT OGTOAISGAGC AAGCTGTTTA TCATCTTCTC 9480 
GTCAGGGCCA CAGATGGAGG AGGAAGATTC IGOCAAGCCA GTATTGTCGT CACGCXAGAA 9540 
GATGraAOG ATAAOGCOCC OGAAOrTCTCT GCCQATCCTT ATGCCATCAC OGTGTTTGAA 9600 
AACACAGAGC CGGGAAOGCT GCTOACAftGA 6TGCACGOCA CAGATBOCGA GGCAOGATTA 9660 
AATOGGAAGA TTTTATACTC ACTGATEGAC TCTGCTSATG GGCAGTTCTC CATTAAOGAA 9720 
TTATCTGGAA TTATTCAQTT AGAAAAACCT TTGGACAGAG AACTCCAGGC AGTATACACC 9780 
CTCTCTTTGA AAGCTGTGGA TCAAGGCTTG CCAAQGAGGC TGACTGCCAC TGGCACTGTG 9840 
ATTGTATCAS TTCTTGACAT A AATG ACAAC CCOCCTGTOr TTGAGTAGCXS TOAATATGQT 9900 
GCCACOOnOT CTGAGGACAT TCITGTTGGA ACFGAAGTTC TTCAAOTGriA TGCAGCAAGT 9960 
CGGGAXKIIG AAGCAAATGC AGAAATCACC TACTCAATAA TAAGIGGAAA TGAACATOGG 10030 
AAAITCAGCA TAGATTCTAA AACAGGGGOC GTATTTATCA nTQAOAAXCT GGATTATGAO lOOSO 
AGCrClCATG AGTATTACCT AACAG1»GAG GCCACTGATG GAGGCAOGCC TTCSUZTSUSC 10140 
GACGTTGCCA. CTGTGAACGT TAATGTAACA GATATCAACX3 ATAATACCCC TQTGTTCAGC 10200

CftAGACACCT ACAOGACAGT CATCAGTGAA GATGCCGTTC TTGAGCAGTC TGTCATCRCG 10260 

CTTATGGOCG ATQATGC0C5A TGGACCTTCC AACAGCCACA TCCACTACTC AATTATAfiAT 1Q32D 

GSCAACCAACS GAAGCTCBTT CMIAATTGAC CCC6TCAGGQ OAtSAAGTCAA AGTGACCAAA 10380 

5 CTTCTCGACC GAQAAACGAT TTCAG6TTAC ACGCTCACXSG TTCRAGCTTC TGATAATGGC 10440 

AC3TCCACCCA GaGTCAACAC GACGACCGTG AACATCGATG TOTCCGATGT CAATQACAAC 10500 

GCGCCCaTCT TCTCCftOGGG AAACTACAST GTC3CTTAXCC AGGAAAATAA GCC3VSTGGGC 10560 

TTChGCGTGC TSCaGCTtSOT ACSTAACAGAT GABGATTCTI OOCATAAiCGG TGCftCGCTTC 10620 

TTCTTTACTA TTGTAACK3G AAATGATGAG AACGCTTTTG AAGTTAACCC GCAAGGAGl^ 10680 

10 CTCCTGACAT CS^TCTGOCAT CAAOAQQAAG QAGAAAGATC ATTACTTACT GCAG6TGAAG 10740 

GTQGCAGATA ATGQAARGCC "TCAGTrGTCA TCTTTOACAT ACATTGACAT TAC3GQTAATT 108Q0 

GAlQGAGAiQCA TCTATOCGOC TOOOATTTTG CCCCtGGAGA TTTTCATCAC CTCTTCTGGA 10860 

GAAGAATACT CAfaOTGGOGT CATTGGSAAG ATCCATGCCA CAOftCCAGGA OGTGTAZGAT 10920 

ACICTAACCr ACAGTCrOCSA CCCTCAOAT6 GAClU\GCT6T TCTCTOTTTC CAGCACftGGG 10980 

15 GOCAAGCTGA TAGCACACAA AAAGCTAGAC ATAGGQCAAT ACCTTCTCAA T<3TCA(3GGa'A 11040 

ACAGATOGGA AGTTCAOGAC QQTGOCCGAC ATCACAGTGC ATATCAQACR AGTCACACAG lllOO 

QAQAaxaTTGA ACCACACCAT OGCGATCOGC TTTQCCHACC ICACTCOGGA AQAATTOGIT 11160 

GGIGACTACI GGCGCWJST CCAQQGAGCT T3A06GAACA TCCTOaSiei: GAGGAGGAAC 11220 

GACATACAGA TTGTTAGTTT GCAGTCCTCT OAACCTCACC CACATCTOGA CGTCITACIX LI28Q 

20 TTTGTAGftGA AAOCAGGTAG TGCTCAGATC TCAACAAAAC AACTTCTaCA CftAGATTAAC 11340 

TCTTCOGTGA CTQACATTGA GGAAATCATT GGAGTTAGGA TAC1X3AATCT ATTCCMAAA 11400 

CTCTGCGC?QG GACTGGACTQ CCCCTGGARG TXqTGCGMG ARAAOQTQTC TGTGGATGAA 11460 

AGTGTQATBT CAACAOUIAG CACAGGCM3A CTIQAGTTTIG TGACTCCCOS CCACCACAGG 11520 

GCnSOGGTGT GTC7TCTaC3U\ AGAGGGAAGG TGCCCACCTQ TCCACQCTGG CTGTGAAGAT 11580 

23 GATCC6TGCC CTGAGGGATC CGAATGTGTQ TCTGATCCCT OGGAGGAQAA ACACACCTGT 11640 

GTCTGTCCCA tJCXSGCaVGGTT TGGTCAGTGC CCAGGQAaTT CATCTATGAC ACTGACTQGA 11700 

AACAGCTACG TGAAATACOG TCTQACBQAA AATGAAAACA AATTASAQAT GAAACIGACC 117E0 

AXGAGGCrCA 6AACATATTC CAGGCATGCG aTTOTCAiaT AieCIOGAGG AACTC3ACTAT 11820 

AGQ^l-CrTGG AGATTCATCZA TGGAAfSGCVG CAGTACAAGT TTGACZCTGG AA6TOC3CCCT 11B80 

30 GGAATTQTCT CTGTTCAGAG CATTCftGOTC AATGATGGGC AGTOGCRCQC AOTOGCJCCXG 11940 

GRA6TGARTG GAAACTATQC TCGCrTGGTT CTAGAOCAAG TTCATACTGC ATC3GGGCACA 12000 

CKDCaZAGQQA CiClGAAAAC CCIQAACCTG OATAACtATG TGTTTTTTGG TGGC5CRCATC 12060 

CeiCAGCAGG GAACAAGGCA TGGAAGAAGT CCIGAAGTTC GTAATOGm CAGGGGTTGT 12120 

ATGGACTCCA TTTATtTGAA rFGGGCAQOAa CTCXXnTTEAA ACM3CAAACC CAGAAOCTAT 12180 

35 GCACACarOG AAGAGTCaaor QGarGTATCT CCMSGCTGCV TCX^ACGGC C&OGGRAGAC 12240 

TGCGOCRGCA ACCCTIGOCA GAATGGAGGC GTTTQCSUVrC OSTCACCTGC TaOAQOTTAT 12300 

TACTGCAAAT 6CAI3TGCCTT GTACATAGGG ACCCACTGTG AQATAAGGGT CAAT00GI(?3' 12360 

TCSCTCCAACC CAXGCCTGIA TOQGGGCSVCQ TOTOTTGrOQ ACAACGGAGG CTTTGTTTGC 12420 

€3U3!rGTaX3A6 GATTATATAC TGGTCOGAGG TGTCAGCTTA QTtXJmOtG CAAAGATGAA 12480 

40 CXICTGTAAGA ATOGOOGAAC ATGCTTTGAC AOTTTGCA-rG GOGCCX3TTTG TCAOTGTIGAT 12540 

TOGGGTTTTA GGGGAQAAAO QTQTCAGAfiT GATATOSAOG ACTTGCTCrGG AAACCCTTGC 12600 

CrGC&£3aSGG COCTCTGIGA GAACAOGCAC GGCTCCTATC ACTGCAACTG CAQCCAOGAS 12660 

TAaUSOGOAC GTCACTGOQA GQATGCTGOG CCCAACCAGT ATGTGTCCAC! GGOGIGGAAC 12720 

ATTGaaTTOG OGGAAG6AAT TGGAATGGTT GTGTTTOTTG CAGOGKTATT TTTACTGQIG 12780 

4S GTGGTGTTTG TTCTCTGOCG TAAQATGAT? AG^CGGAAAA AGAAGCATCA GGCTG^ACCT 12 840 

AAAGACAAGC ACCTCSGGACC CGCTAOGGCT TTCTTGCAAA QACXGTATTT TGATTCCAAQ 12900 

CTAAATAAGA ACATTTACTC AQACATACCS^ CCCCAGGTGC CTGTCCaaOC TATOTTCCTAC 12960 

AOOCCOaOTA TTCSCAASTSA CTCAAGAAAC AATCIGGACX! GAAATTOCTT CGAAOQATCT 13020 

GCXATOCCAG AGCATCCOQA ATTCAfSCAClT VTVAACCCCG AGTdXTTOCA OGOGCAOCGA 13080 

50 AAASCAIGTGG CGGXCTGCAG OGTGGOSCCA AACCTQCCTC CCOCaCOOOC TTCAAACTCJC 13140 

CCTTCPGACA GCSHVCTCCAT CCAtSAAGCXlIT AGCTGGGACT TK3ACTA3t3A CACAAAAGTO 13200 

GTOGATCmG ATCC3CTGTCT TTOCAA8AAQ CCTCTASAGG AAAAGCCTTC CCAOGCS^l^ 13260 

AGTGCOCGOG AAAGCCTGlC irGZUUKrGGAG TCOCTGAGCT CCTTCC!AGTC QGAATOGTGC 13320 

GAT6ACAATS GGTATCACTG GOATACATCA GATTGGATGC CAAGOGTTCC TCTGCOGGAC 13380 

55 ATACAASAGT TCCCCAACTA T6AGGT6ATT GATGAGCAQA CkCCCCIGTA CICASCAGAT 13440 

CCAAACGCCA TOGATACOGA. CTATTACOCT GSAGGCTAOG ACATOSAAAQ TOATTTTOCT 13500 

CCACCCCCAO AAGACTTCCC CGCAGCTGAT GAGCTACCAC CQTTACXX3CC GGAATTCAGC 13560 

AATCAGTTTQ AAT0CMK3CA GCCTC3CTA6A GACAIGCCTG CC6CX960TAG CTTGGSTTCT 13620 

TCarCRABAA ACGSGCASA6 GTTCAACTTB AATCAGTATT TGCCCaWETT TTATCCCCTC 13680 

60 GATATGTCTG AAC3CTCAAAC AAAAGGCACT GGTGAGAATA GTACITGTAG AGAAOCCCA7 13740 

QCCCCTTACC GGCCAGGGTA TCAAAGACAC TTCQRGCSCGC OOGCXGTOGA GAGCMGCCC 13800 

ATGTCTGTOT" AC3QCCTCC3«3 OGCCXCCTGC TCTO^OOTOT CAQCX^TGCIS OGAACTGQAQ 13860 

TCXaSAGGTCA TGATGAGTGA CTATGftnAJSC CSGSCSACJSZiaS GOCACITCQA AQAGGTGftOG 13920 

ATCOCGCX3CC TGOATTCGCA QCAGCACACG GAAQTCTGAG TCTCRACTC3C COCCAAftOTG 13980 

65 OCTGACTTTA GTGAA<XrrAa AGGTGA.TGTG AGTAA^TGGGC GCTOTTCTTT GCAGCAGTGC 14040 

TTCCAAQCTT TTTTOXSGTGA GCOGAATGOT CATOGCTGCG COJGGATCCTG OGtXZTCTGGA 14100 

CBTGCTAjGCC ATTTCCAGTG TCCCAACTAC TGTCATCGTC AGGTTTTCAT C3GQCI6TGCC 14160 

ATTTOOCAAC GTCTTTTGG6 ATTTACA7CT OTCTGTGVTA AAATAATCAA AOQAAAAATC 14220 

AGT OC TGTGT TOTCKOCKIG ATMVTGTAT TtATATASAT TTGATTATTT TAATTTTCCT 14280 

70 GTCTCTTTTT TTTGTAAATT TTATGTACAG ATTTORTrrX TCATAGTTTr AACTAGATTT 14340 

CCAAfiAO^ATT TTGTGCATTT GTTTCAACTG AATTTTGGTG GTGTCAGTaC CKTOOCSKa 14400 

CAOCCTGATT TTTTTTTTTT TACTATAACC AGGOTTTCAT TCTGTCrm TCCACTOAAS 14460 

TGTGACATTT TGTTAGTACA TTTC»GTGTA GTCATTCATT TCTAGCTGTA CATAEGATGA 14520 

AGOAGAGATC AJaATACATGA ACATGTCTTA CATGGGTTGC TGTATrTAGA ATTATAAACA 14SB0 

75 TTTTTCATTA TTGGAAAGTG TAACOGSGAC CTTCTOCATA CCTGTTTAOA ACCAAAACCA 14640 

CCATG&CACA GrrTTTATAG TGTCTGTATA TTTGTGA!EGC AATGSTCTTG TAAAG5TIT7 14700 

TAATGAAAAC TACCATTAGC CAGTCTTTCT TACTGACAAT AAATTATTAA TAAAAT 147S6 
GTGOTGTTTG
CTGAAGACAA 
CACOAXGCAT 
TGCITTRCTT 
TGTCJGTTCTT 
TCATTGGGCT 
QAT{3TGCC?AT 
6T6TCATTGT 
AGT66AACTA 

TGGCTCTTGG 
QAiOQCATATQ 
OCAGGAObGA 
ATTTGTAAAA 
TAAAGACTGG 
aTTTATTTOT 
TTACAGACTG 
JUUkTAACftTT 
AGTTGCTTTT 

AACTGCCTTG 
TSTTCCAATC 

ATCKTGTGTT 
AATAAA6CAQ 
AGGAAATAGC 
AAAAAAAAAA 



CTTTCTCCAC 
AGAGAAGGGO 
CGGACATTCX 
TCOCAATGGO 
TTCTGGCATC 

GCTTTCTTCT 
GGCAGCOCXT 
CaCCTTTGCC 
TGRACCCRAQ 
TGGAATTGAA 
TBGCTTTTGC 
GCGA<%ATCT 



CATCTTCACA 
TTTTGTTTTT 
AGTGACAGTA 
GCAAZCACTft 
TAIAAGACCA 
ATQATOTTTC 
TGTTCTGTGA 

TTTTTCTAAC 
TAAAAl^AAGA 
TTGATiZAaCR 
CTCAAAACTA 
AAAAAAAAAA 



CAGAAGGOCA 
GIAGAAAACCT 
CTOGiTaaGGC 
GAAACAAAGT 
OTAGGJUSGTS 
GACTGCTGTG 
GTATT<3GCTG 
G6CITAGCA6 
AGCACOGAGG 

TTCATCTTGT 
TC3CTCrCACC 
TCCTCTATTT 
GTGTAACATA 
GGATGTCAGT 
TTTTAA^SGAA 
CTCAC3TATAT 
T3GXATATAT 
aGAAGSAGAA 
CCATTCATAC 
GAAACAAATA 
CATCACAACT 
ATATGGAAAQ 
A2USGCTAGGA 
TCATIGOAAC 
ACTTGTTTAC 
AAA 



CACTTTCATC 
AGCAGACCAC 
TGGCGCTCXZT 
AT6CCTC0GA 
GCCTGCTGAT 
GCTGCTGTQG 
CTCTCATTGQ 
AAOGACCACX 
GCCAGTACCr 
AATCGAATGT 
GTCTTATTCA 
AACACCAATA 
CAirGTAATT 
CTCCICGACAS 
GTrrAAATTT 
TGAGGAAACA 
CIXSAOATAAA 
GXGCATGTAr 
AATCOQACAA 
ACCIATAAAT 
TTTACTTAGA 
TACAA-IGCT6 

TGACTGGGCA 



TAATTTGGQG 
CAIGTGCTAT 
OTGCATCSGGr 
AAAOCACCTC 
DCTCCTGCCA. 
CCATGAAAAC 
AATTGCRC5GA 
AT6TCXTGAT 
TCTOCJATACC 
ATCTCPGTTT 
AGTAATAAAT 
TGACTGCTAA 
TATATATTTC 
TCIACITTTA 
AGTAAACTTC 
AACCACCCTC 
CTCTATAATB 
ITTTTAAATT 

cxrraaAAAOA 

CTCTAACAAG 
GTGGAAC3GAC 
CTCSVTtGTTG 
TOCSUUUaATQ 



AACAAAATAA AGXATTCACT 



TATGACTGAG 
GGGAAGTOTG 
GCTAATATTT 
AGCCGCTTGG 
GCATTTGTCT 

TGrroGouukc 
TCTGGCTACT 



TCCACATOGT 
TCTATCCTCT 
GGAQTaCTTG 
AAGAACCAAC 
ACTTQTATTC 
CAAAOGCCIG 
TTTOTTQTTT 
TGGGGGTAGr 
TTITGOATAA 
AAftS ATGTCT 
TTTTTSTTTT 
AGGCCCTTTG 
TOATTOAGAA 
TGAGTACTAT 
AQTACTASGG 
TOQ6AAACTO 
eAGG&GCAOG 
AOQAAAAAAA 
1 11 21 31 41 51

| | | | | |

OTTCTTTGTG ACACATCACA CAfiAArTGGA GTGCXGTCCr TCTGGRG&GT GGTGGRGRAC 60 

CAAGAOACAG TTCAG3W«:CA AAGGAATAiSA CaAAGGQCTTT OATTTCTTTT TOaClTTAGA 120 

TTGCHSGATTT OGQAGGCITA GCAGGAAAGA TGTCCftCXGA AAATGIGGAA GGGRR GCCCA ISO 

GIAAOCTTGQ GGA6AGAGGA AGAOCCXSSQA GCTCCACTTT OCTGAaGGTT OTOCaiQCCAA 340 

T(3TTTAACCA OU3TA*£TTTC ACnCTGCAG TCXCTCCTGC TGCAGftAOGC ATGOGATTCA 30O 

TCTTSG6AGA GGAGtSATGAC A«3CCCAi?CrC CCCCTCAGCT CTTCACOGAA CIOGATGflGC 360 

TGCTGGCCGT GGATGGGCAG GAGATGGAGT GGAAGGAAAC AGOCAGGTGG ATCAAGTTTG 42D 

AAGAAAAA6T GGAACAGGGT GGGGAAAGAT GGAGCAA<3CC CCATOTOGCC ACKITGTCCC 480 

TTCATMSTTT AXXllGASClG A6GACATGXA TCSQAGAAAGG KEOCPaCKSG GTTGAT0QG6 540 

AOGCTTCTTC TCTCCCACAG TT6GT6GAGA ■TGATTOTTGA OCAV<3SA!rT G»l3AauSQCC 600 

TATTQAAACC TGAACTTAAQ GATAAfSOTGA CCTATACITT GCICCGGAAG CAiOOGGCATC 6S0 

AAACCAAGAA ATCCAACCTT GOGTOCCTGG CTGACATTGG GAAGAC»STC TCCROTgCTRA 720 

Gnu3(3ATC3TT TA£3CAACCCr OATAATGGTA GOCCAGCCAT GAOCCASTAGG AATCTGACIT 780 

GCrOCAISTCr GAATGACATT TCIGATAAAC GGGAGAM3GA CCAGCTOAAO AATAA GTTC A 840 

*D6AAAAAATr GCCAOGTGAT GCiAGAAGCi:^ CSCAAOaTGCT TGI^QGGGAG CTTOACTITT 900 

TQGATACTCC TTTCATTGCC TTTGTTAGGC TACAGCAGGC TOICATGCrG QOTQCOCTSA 960 

CTGAAGTTCC TGTGCCCAilA AOGTrCFTGT rCCATTCTCTT AGGTOCTAAG GC3GAAAGCCA 1Q20 

AOTCCTACSCA CQAQATTGGC AGMOCATTQ CQ^CQCTQAr OTCTQATaAG GTGTTCCAIG 1060 

ACATICCTTA TAAAC3CAAAA GACMSGGACG AOCEGATTGC TGCIATTGAT GACTICCXAS 1140 

ATGAAGTCAT CQTC9CTTCC». GCTQGGGaAT GOgAlQCSUaC AATTAOSATA GI^jQCCCCCXA 1200 

A6A6TCTTCC ATCCTCTGAC AAAAGAAAiGA ATATGTACTC AGGTO6AQA0 AATGTTCAaA 1260 

IOAATGGGGA 'EACGCCCCAT GATGdAGGTC AOtaGAlSGAGa AGGACATUGGG GATTGTGAAG L320 

AAmSCAGCG AACIGGAOG6 TTCTGTC^TG GACIAATTAA AGACATAAA5 AGGAAAGOGC 1380 

CATTTTTTdC CAOTQATTTT TATGATQC3T TAAATATTCA AGCTCITTCXS GCAAIOrGTCT 1440 

TCATTXATCT GGCAACTGTA ACTAATGCTA TCACTTTTGG AgGaVCTGCTT □GGGATOaCSL 1500 

CTGACAACAT GCSUSGGCGTG TTGGZ^CUVSTT TCXn!GGGCAC TGCIGTCTCT G6AGCCATCT 1560 

TTTGGCTTTT TOCTGGTCAA CCACTCACTA TTCTGAOCAG CACCOOACCT (STCCTRE^rr^ 1620 

TTG3M3AGGCT TCTATTTAAT TTCAGCAAGG ACAATAATTT TGACTATTOS GAGTTTCX3CC 1600 

TTTGGATTGG CCTGTaOTCC GCCTTCCTAT GTCTCATTTT GQTA)C3CC!ACT GATGCCAGCT 1740 

TCTTG3TTCA ATACTTCACA CGTTTCAOQG AGGAGGGCTT TTCCTCTCTG ATTAGCTTCA 1800 

TCTTTATCTA TQATQCTTTC AAOAAGATGA TCAAfiCTTQC A«3AT5JACTAC OCCATCSUiCT 1860 

CCAACITCAA AGTGGGCTAC AACACTCTCT TTTCCTGTAC CTGTGTGCCA CXDCGACCCAG 1920 

CTAATATCTC AATATCT/VAT GftCACCACAC TOGCCCCAtSA 6TATTTG0CA ACTATGTCTT 1980 

CTACTOACAT GTAOCATAAT ACTACCrTTt} ACTGGGCATT TTTGXCOAAa AAGOASIBTT 20 40 

CAAAATA0G6 AQOAAACCrT GTOGGGAACA ACTGTAATTT TGTTGCTGAT ATCACACTCA 2100 

TGTCTTTTAT C3CTCTTCTTG GGAACCXACA CGTCTTCCAT GGCTCTSAAA AAATTCAAAA 2160 

CTAGTOCIXA TTTTCCAACC AjCAGCAAGAA AACTGATCAQ TQATTTTGCC AITATCTIGT 2220 

CCATTOTCAT CTTTT6TGTA ATAGATtSCCC TRGTAGGCGT GGACACCCCA AAACTAATTG 2280 

IGCCAAGTGA GrrCAAiSCCA ACAAGTCCAA ACGQAGGTTQ GTTCGTT0C3^ COGTTTGGAG 2340 

AAAACCCCTG GTGGGTGTGC CTTGCTGCTQ CTATCCCQGC TlU'tfntJ G 'l'C ACTATACTGA 2400 

TTTTCATGGA CXAACAAATT ACAGCTGTGA TTOTAAACAG QAAAGAACAT AAACTCAAGA 2460 

AAGGAGC3V0Q OTATCRCTTTG GATCTCTTTT GGGTGGCCAT CCTCATGeTT ATATGCTOOC 2520 

TrCATGGCTCT tCCGTGGTAT OTAGCTGCTA CQOTCATCTC CATTGCTCAC ATCGACAGTT 2380 

TGAAGATGGA GAO^GAGACT TCTGCACCTG GAGAACAAOC AAAGTTTCTA GGAGTGAGGG 2640 

AACAAAOAGT CACTGCJAACC CTTGTGTTTA TTCTGACIGG TCTGTCAGTC TTTAIQGCTC 2700 

CCATCTT6AA 6ITTATACCC ATGCCTGTAC TCTATGGTGT GTTCCrGTAT ATGGGAGTAS 2760 

GATCCCTTAA TGQTGTGCAG TTCATGGKXC CrCCTGAASCT GCTTCTGATG CCTCTGAAGC 282D 

ATCAGCCIGA CTEGATCrAC CCGOBTCATG TTCCTCTGGG CAGAGTGCAC CTGnOUCIT 2880 

TCKnGCAGGT eTEGTOTCTS GCCCIGCriTT GGAXCCTCZkA GTCAAC^GOTQ GCTGCTATCA 2940 

TTTTTCCRGT AATGATCTTG GCACTTtSTAO CTGTCAGAAA AGGCATGOAC TAC5CTCTTCP 3000 
OCCAQCATCA CCTCAGCTTC CTGOJVFGATG TCMTCC»GA AAAGGACAAG AliAMU3AAOC; 30«0

AGOATQAGhA GAAAAAGAAA AAGAAGflAGG GAAGTCTGGA CAGTOACAAT 6ATGATTCTG 3120 

ACTGCCCATA CTCAGAAAAA GTTCCAAGTA TTAAAATTCC AAlOGACATC ATGGAACAGC 31B0 

AACCTTTCCT AAQCGATAGC AAACCTTCIG AGAGAGAAAG ATCACCRACA TTCCTTGAAC 3240 

5 GOCACACATC ATGCK3ATAA AATTCCTTTC CTTCAGTCAC TOGGTATGCC AAGICCTCCT 3300 

AGAACTGCAG TAAAAGTIGC CTCAAATTAG ACTAGAACTT QAACCTGAAjES ACAATGlkTTA 3360 

TTTCTGGAGG AOCAAGGGAA CAGAAACTAC ATTOTAACCT GTTTGTCTTT CTTAAAACIG 3420 

ACATTTOTTG ttaatqtcat ttgtttttgt ttggctgttt gtttattttt taacttttat 34 bo 

TTOGTCTC&G TrTTTCGTCA CAGQCCAAAT AATACAQCGC TCTCrCtCGCT TCTCTCXTGC 3540 

10 ATASATACAA TCAAOACAAT AGIGGACCGT TCCTTAAAAA CAGCATCTGA GGAATCCCOC 3600 

TTTTGTXCTT AAACTTTCAG ATGTGTCCTT TGATAACCAA ATTCTGTCftC TCAnG9U3U3L 3660 

GACACCXACA GACCCTQTCC TTTGCCTCTA TTAAGCftlGAG GATGGAAGTA TTAAGCaATTT 3720 

TGTAAC&CCT TTTATGAAAA TOTTGAAGGA ACTTAAAACT TTAGCTTTGG AGCTGTGCTT 3780 

ACTOGCTTGr CTTTGTCTGG TAGAACAAAC CTTGACCTCC AGACAGAGTC CCTTCTGRCT 3840 

15 TATAGAGCTC TCCAGGACTG GAAAAAGTGC TGCTATTTTA ACirOCXCTT GCITGTAAAT 3900 

OCTAATCT7A 6AGTTATCAA AAGAAGAAAA AACTGAAGGT ACXTTACTCC CTKnUSAiaAA 3960 

ACCATTGCCA TCATTGTAGC AAGTGCTOGA ATGTGCCTTT TTTCCTATGC AACTTTTTTA 4020 

TAACCCTTTA ATGAACTTAT CIGTGGAGTA CATTGAAGAA TATTTTTCTT CCTAGATITT 4080 

GTTGTITAAA TTATGGGGCC TAACCTGCXZA CTTATTTTTT GTCSiATTTTT ARARCTTTTT 4140 

20 TTTAATTACT GTAAABAAAA TGAATTTTTT OCIGCAGCA6 GftAACATAGT TITCAGTAGT 4200 

ITCCAOCTCTT ATTTGTAGCT GCCAGGCTUT CTGTAAAAAT TGTATTGTAT AXAAXGTGKT 4360 

TTTTACaXZAT ACATACACAC ACA2^TACAC AATCTCTAQG GTAAGCCAQA AGGCAAGATC 4320 

AGATIAAAAA CACCATGTTT CTAAGCS^CC ATITTTCCCI TTCTTTAAAA GAAACTTAAC 43 BO 

TGTTCTATGA AGGAGATTGA GGGAGAAGAG ACAAACTOCT ATOTCATQAG AATAACGGAT 4440 

25 GTTCTGATAA TAGTAGCATC TAGGTACAGA TGCTG6TIGT ATTACCAOGT CAATGTCCTA 4500 

TGCAGIA^TTG TTAGACAT7T TCTCATTTTG AAATATTTGr GTC!TTTOTGT AT6TGCICI6 4560 

TOGCATGGCT GOTQTATATA TGlGCAATG<r TAGAAGGCAA AAGAGTGATG GTAGOCAa&a 4620 

GGCAAAGTCA TT6AATCTCT TATGCCAt3TT TTCATAT^AAC CCAAAGCAGA TATGftAAAAA 4680 

TCCATTAAGG GTCCAAGAflS TCTGTGCATA TGRAAATGAG GGTAAATATA QTTTATTTCC 4740 

30 CAGGTATCAG TCATTATAAT TGATATAATA GCTCTAACAT GCAATATAAA ATTCATAGGA 480O 

GTATTAATAG CCCATTTACA CATCTATAAA AIGXftATOGS ATTQCAGAGC TGCaGAOrAC 4860 

AGTGTAACAG TACTCTCATQ CAATTTTTTT CAGGATGCAA AGGCAATTAT TCTTTGTAAG 4920 

CGGOACATTT AGATATATTT GTGTACATAT TATATGTATG TATATTTCAA AQTACCACAC 4980 

TGAAAATTAG ACATTTATTA ACXSIAATTTA AOGTOGTATT TAAAGGTAAT ATTTTTAATA 5D40 

35 TOATACATTA CATA'XTGTGA AT6TATACIA AAAAAACATT TXAAASOTTA AAATVATAAT SlOO 

TTCAGATTCA TATAACCACA ACTGTGATAT ATCXTEAACXA XAACCAISTTG ITGAGGGGTA 5X60 

TACTAQAAGC AGAATGAAAC CACATTTTTT GGTTTGATAA TATGCS^CTTA TTGACZCCCA 5220 

CTCATTGTTA TQTTAATTAA GTTATTATTC TQTCTCCTTG TAATTTTGAT TACAAAAATT 5280 

TTATTATCXrr GAGTXAGCTG TTACTTTTAC AGTACCTGAT ACTCCTAAAA CTTTTAACTT 5340 

40 ATACAAATTA GTCAATAATG ACCX3CAATIT TTTCATXAAA A1AATAGTG6 TGSU^TTATAT 5400 

GTTATVGTGT TAAAAGCICri CITGCCMUkT ICXOGCITCA CATTTOXAIT TAaGGCTAXC 5460 

CTTAAAATGA TGAGTCTATA TTATCTAGCT TTCTATTACC CiStiUCATAAA CIGGTATAAG 5520 

AAGACTTTCC TTTTTTCTTT ATGCATGGAA GCATCAATAA ATTOTTTAAA AACC3UPOIAT SSBO 

AGTAAATTCR GCTTAACCOQ TQATCTTCTT AAmiAAAGG TACTTTTGTT TTATAAAAGC 5640 

45 TCTAGATAAA ACITTCIITT CTQATC3V.TGA AXCAAGTATC TGTGGTTTCA TGCCCX^TCTC 5700 

TATACCTTTC AAAGAACTOC TGAAGGAACX TAACTCATCA TTTOySCSClC TGM3T1U8AGG 5760 

TAAAACCTAT GlGTAiCTTCT GTTTAlGATiC CAXATTGATA TTTATGACAT gAArATftflRA 5S20 

TAGTACCTTA CATTTGCTAA ACaeACROTT AATATCAAAT CCTTTCAATA TTCTGGGAAC 58B0 

CCAGGGAAGT TCTTAAAAAT GTCATTACTTT TCAAAGGAAC AGAAGTAjaTT AAOGAAACTA 5940 

50 ACAftGCAAAA CX7FGAGOTTT AOCXAGTQAC ACCAAATTAT GGGTATTTTA ACTQAATTIA 6000 

GCCATTGACT AAGAATGAAG CX3GATTTGGr GGTGGTTTT6 TTTCXATGCA AACTQGACAC 6060 

AAAXTACAAC AGTAAA'^PTT TTmTAAGTG CTTClGCCIir CTGCATGAIG TGACTTCOBG 6X20 

AGATAAAG6A TTCAAAAGAT AAAGACAAAG TAOGCTCAOA OTTOTTAACC AGA2\AGT0Cr 6180 

GGCTGTGOTT GCAGAAACAC TGTIGGAAGA AAAGASATGA GTAAGTCAAG TGTCIGGCTT 6240 

55 ATGAAAAGAG CAAAAATGCC TCTGCTTTTG TGITTGGGAS AAAAAIA7*CT 1X3GACGCAGT 6300 

CTTTTCCTTG KCMJMSlXA. TCTTCTCTAC TGTGTGftAAT GAATACTTGG AATTCZAftTT 6360 

CTTTTGTG T G CCAGGGGCAG TAATOTCCCT GGCTCXTCfTC a^ATCAAGG TTGAGGAGTG 6420 

GGGCIGGGOA GAGGACTTAA CTGACTXAAG AAG1!AGGAAA ACAAAAAOCT CTCTCCICAG 6480 

CCTTOCACCT OCAAGAGAGG AGGAAAAACA GTTGTCrracr GTCTGTAATT CAGTTTGOGT 6540 

60 G]!ATT!rTAXG CXCATGCACC AAGCCATACA GAGTAAATCI TTTATCAACI ATATACTGCT 6600 

GTTTAAT3U3A QAATOATTGT CTTGCSaAOTT ^CTTTGGTTQC TTVmAACT GIGTTAAAlQT 6660 

ACTTGAAATG TATTGACT6C TGACTATATT TTAAAAACAA AATGAAATAA TTTGAGIIGX 6720 

ATTACAGAGG TTQACATTGT TCAOGOATGG GACRMUSCCT TCTTCZ^ATCC TTTTCATACT 6780 

ACTTAATGAT TTTGGTGCAG GAACCTGAGA TTTTCTGATT TATATTTC&T GATATTTCAC €840 

65 ATTTGCXCTT CACAGCSVTGA GCATGAAGCC CZUSTGGGACX: AAATGGCTGG 9IKCAATCSUV 6900 

GXQATATTTT GTAGCAGCTC ACTATCTOAA AGGCCATGAG TTTTCAGA*EG KrTTCAITGA 6960 

GCITCATTGC AOCCTGAAAT <ETTAAAAAA6 TaXSHGTARXA OSOCAAOCAG TC3JW(3T16T6 7020 

TTTTOSCCAG AGATTTAGAT ATGTCCAATT TOCTGGCTCA ♦TTTCSOTGTG CTCTAIGGOT 7080 

AOGTATAAAA AQCAAGAATT CTGTTTCCTA GGCAAACATT GCAACTCAGG GCTAAAGTCA 7140 

70 TOCftgWaAA CTTTTAGAGC CAGAAGEAAC TTTGTCGCAO TOCTACAATO TOAAAAGAGT 7200 

GAATABIXQC CTCTTTTmS GC3VmTCAT OaCTGGTAak TAfCXGGIAGG GATTACTTIT 7260 

CAGAATCAAT ACGCACi'i'i'C AGATATTCTT ATTTTTATTC TCTTAAGTCT TXATTAACTT 7320 

TGGAGAGAGA AATGATGCAT CTTTTTATTT TAAATGAAGT AGATCAACAT GGTGGAACAA 7300 

AATGATAAAG AACAGAAAAC AmCAATAT ATIACTAATA ACTTTTTdCA AlATAAATCC 7440 

75 TAAAATTCCT ATAACATAGT ATTTTACAGT l^TTAXGAAGC T1TCTATT6T GACTrtTATG 7500 

GAATTAAGAG ATGAAOAAGA tTGAGATATTT TAGCAT7TAT ATTTTICAAA ATIKIKIGTA 7560 

TACTTAAAAA TAAAGXAACT TTATGC 7S86 

Seq ID NO: C163 DNA Sequence
Nucleic Acid Accession #: NM_000958
Coding sequence: 389..1855

1 11 21 31 41 51

| | | | | |
CGGCACAGCC TCACACCT(3A ACOCTaTCCT CCCGCAGACQ AGAOCGGGGS QCACTGCARA 60

GCTGGGACTC GTCTTTGAAG GAAAAAAAAT AQCGAGTAAG AAATCCAGCR CCATTCTTCA. 120 

CTGAOCXiATC CCX5CTGCACC TCTTGTTTC3C CAAGTTTTTG AAAQCTGGC& ACTCTGACCT 180 

COOTGrCCAA AAATOSACaVS CCACTGMaC 0GC3CTTTGM ABiGGOGAftGA TTTOGCAQTT 240 

TCCAG?^GA GQ^OGACAAQ GTGAAAiSCAG (JTTGGAGGOS GGTCCAGGAC ATCIGAGGGC 300 

TOACCCnGGG GGCTCGTGAG GCTGCCA,OC6 GI6CTGCCGC TACftGACCC^V GOCTTaCACT 3€0 

CCAAGGCTGC GCACOGCCAQ CCftJCmATCAT QTCCACTCOC GQtSGTCftRTT OGTCCGCCTC 420 

CTTGAGCCOC GACOQSCTGA AOU^OCCAGt GACCA^CCCQ QCGSTGATGT TCATCTTCGG 4 80 

GGIOGIGGGC AACCTGCSTGG CCftTCXTrCJOT GCTGTacaaG TOGOGCRAGG AGCAGAACSGA S40 

GACXSACCTTC TACACGCTGQ •EATGTGC3GCT GGCTGTCACC GACCTQTTGG GCACTTTGTT GOO 

GGTtSAiSGCCS <STt5ACCA.TC3G CCACGTACAT OAAGGGCCAA TGGOCCGQGG QCXSUSCOaCI 660 

GTGOSAGTAC AGCACCTTCA TTCTGCTCTT CTTCAGCCTG TCOGQCCTCA GCATCfkTCTG 720 

OQCCRTQAIJr CTCGAGCGCT ACXITGGCCAT CAACCATGCC TATTTCTACA GCCACTACXTT VBO 

GGACAAGOGA TKSGCQQOCC TCAOSCTCTT TGCAGTCTAT GOGTKXauvCG TGCTCTTTTG 840 

CQC2GCTGCCC AACATGGGTG TOGGTAGCTC GCQQCTGCftS TACCCASACA CCTQGTGCTT 900 

GATCGACTGQ ACXZADCRACQ TGACGGCGCA CGCCGCCTAC TOCTACATGT ACGC3GGGCTT 960 

CAGCTCCrrC CTCATTCTOG OCACXJOTCCT CTQCAACCnS CTTGTGTGOG GOQOGCTQCT 1020 

CGGCftTGCAC O0CCIU3TTCA TGCGCOGCAC CTOGCTGGGC ACQaRQCAGC AOCACGCGGC 1080 

CGGGGCC6GC TGOSTTGCCT CCCGGGGCCA CCCCGCTGCC TCCCCAGCCT TaCKBCaCCT 1140 

CAeGGGACTTT CGGC8CCC3CC GQAQCrrrCOG 006CA.TCG0G GGOGCCQAiOA. TCCAGATGGT 1200 

CATCTTACTC ATTGOC&OCT <MCrGGTGGT GCTCATCTGC TCCA.TOCCX3C TOGTGQTGCXS 1260 

AGTATTOGTC AACCAGTTAT ATCAGCCftAG TTTGGAGCGA GAAOTCRQTA AAAATCQWSA 1320 

TTTGCftGGGC ATG0SAAT1G CTTCTGTGAA COCCATCCTA GACOCCTGGA TATATATCCT 1380 

CCTVSAGAAAO ACmOXGClCai GEAAAGCAAT AGAGAAGATC AARTQCCTCT TCTGCCGCAT 1440 

rGGCGGGTCC CGCAGGGAGC OCTCCOQACA GCACTGCTCA GACAGTCAAA OOftCATCTTC 1500 

TCJCCATGTCA GGCCACTCTC GCTCCTTCRT CTCCCGGQAG CTQftWGGAGA TCAGCftGTAC 1560 

ATCTCAGACC CTCCTGCCAiQ ACCTCTCACT GCCAGACCTC AGTGAAftATG GGCTTOGAGG 16^0 

C2U3GAATTT6 CTTOCAGGTG TGCCTCOCAT □OaCCrOQCC CAGGAAGACA CCACCTCACT 1680 

GAOGACTTTa COAATATCAG AGACCTGAfiA CTCTTCACftG GtJTCAOGACT CAGAGAGTGT 1740 

CTTAGTG6TG GATGRGGCTG GTGG(3?«30OQ CAGGOCTGGG OCTGOCCCTA AGGGGAGCTC 1800 
0CTGGAAJ3TC A<2ATrrCCCA GTGAAACACT GAACTTATC3V. GAAARATGTA TATAATAGGC 1860 
AAGGAAAGAA ATACT^GTACT OTTTCTGGAC: CXTTTATAAAA TCCrGTGCAA TASMStCKCA 1920 
GRIGTCACAT TTAGCTOTQC TO^AAGGGC TATCKICA. 1958 

Seq ID NO: C164 DNA Sequence

Nucleic Acid Accession #: NM_002659.1

Coding sequence: 427..1434



1 11 21 

1 I I 

CAGTATCCCT CCTQACAAAA C^^AACAAAAA. 
TATTTACOGT CAAAGTTTTT ATCGTCATTT 
OCAOOTTAOG AABAGA0AOA ACn30GA<XTT 
ATCACAACTC CATGAGTCAG GGOOGAiGCCA 
GGAAGGAAGT TTGTGGCGGA GGAOOTTCQT 
GGGGCTGACT OGCTCTTTGG GAAAAOGTCT 
TCCTTCXTTGA GGCCAGAAi3G AaASAAX3A£!G 
CGCGAGATGG GTCACCGGCX: GCTGCTGCG8 
OOCTCTTGGG fSGCTGCOSra CATGCAiGTG? 
TGOGCXXTTGG GT^CAOGACCT CIGCAGGACC 
OA<?CTCK3AjSC: TGGtTGGAGAA AAGCTTGITLCG 
TATGGGACTG GCTTGAAGAT CACCAGCCX'T 
AAOaVOGGCA ACTCTGflCCG GCSCaXSTG&CC 
TCCTGTGGCT CATCAGACAT GAQCTOTGAO 
nSCGCXGAAG AIVCAGTGGCT GGATGTG6IG 
CGTCCAAAGG ATGACC3(3CCA CCTCCX7IGGC 
AATOGTTTCC ACAACAACGA CACGXTCCAC 
AAOGAQGGOC CAATCCTOGA QCTTGAAAAT 
TGCAAGGGGA ACAGCACCGA TQGATGCTCC 
OaOQOCATGA ATCAATQTCT G6TAGCCACC 
A3X3GTAAGA6 GCT^GCAAC OGCCXCAATQ 
ABCATGAAOC JUCP^TIQKFGT CTCCTGCT6T 
GATGTCCftGT ACOGCAGTGQ GGCTOCTCCT 
ATCACCCTX3C TAATGACTGC CAGACTGTGG 
AA'TOCCCCTC TCTOCCaCTGG CTGGATCJCSQe 
GCGCTACAGA CTTGCI6T6T GAGCTCAQGC 
TOCGAGCTAT GAAAACAGCT ATCTOVCAAA 
GAAGGCCGT6 GGGMVrOGGA GAGCTCL'TGT 
GTTGTTATTA ATXAAXAXTC ATATTATEXA 
TOG 



31 41 51 

1 I 1 

ICCTGTTA6C CAAATAATCA. GOCACftTICA 
TACSWSCAlSTG OAGACCGATT GGCCCGOGTC 
6CAOCCAGGC AATCTGGQGA CAGAGGTGTG 

QCCcxirrTCAC caccagcggg ccG0Gcx;ca3 

ACGGQAiQGAG GGGGAGGOSC CCAOQCATCX 
GGGAQOn^rC CCTGGGGCCA CAAAACIGCC 
TGCAOQGAGC COGC3GCAGM3 0AQCTGOCCT 
OTSCXGCTGC TGGTOCACAC CTGOGTCOCA 
AAfiACCAAGG GGGATTGCCG TCJTGGAAGAG 
ACGATQGTGC GCX^KSTGGGA AGAAGQAGAA 
GACTCAGAGA AGI^GCAACAG GACCCT6AGC 
AC(73A6GTTG TGTGTGG6TT AOACTTOTOC 
TATTCCCGAA GCCGTTAOCr GGAATGCATT 
AGGGGCCGGC ACCAGAGCCT GCA^TOCdSC 
ACCCACTGOA TOCAl»3AASG IGAAGAAGGO 
^TGOXSGCTACG TTCCGGGCTG COCOCSGCXOC 
TTCCIGAAAT GGTGCAAOU2 CAOCAAATGC 
CTQCOSCa^fiA atgqc3ogcx:a otgttacrgc 

TCTGAAGAGA CrTTTCCl^CAT IGACIGCCGA 
GGCftCICAOG AAOOQAAAAA OCAAAGCXAT 
TQGCAAiQLTCr CCCACCTGGS TGAOQCCITC 
ACTAAAAGTG GCTOTAACCA CCXSVGACCIG 
CRGOCTGGCC CTGCOCATCT CAQCCTCACC 
G6AGGCACIC TCCTCTGGAC CIAAAOCTGA. 
GGGACCCCTT VSCCCSTCCC TCGGCTCOCA. 
GAGTOnaCOG ACCZCXCTG6 GOCTCAGnT 
GXXGTGrGAA GCaGAAQAGA AAAGCTGSA6 
TATTATTAAT ATTGTTGGGG CTGTTOTOTT 
TXITAXACIT ACATAAAGAT TTTOTAOCAG 



Seq ID NO: C165 DNA Sequence
Nucleic Acid Accession #: AK027843
Coding sequence: 193..1731
1

! 

TTGCTTGAGT 
CTAAATAGCA 
CTOTTAiOauS 
TCOTATTTCSC 
CCTCCAAACT 
TTTACTTTCT 
GTGAGTTATG 



11 
1 

CATCTTCIGA 
CATCACATQT 
GGACAAATGC 
AOATGOATXX 
TACnGAGAA 
TCAACAAAAC 
TGATGGOSTG 



21 
I 

AGCTTTAAAA 
6AATATTACA 
AAZTTCAAAl: 
TSAGAGTOGA 
TTTAAGTCCA 



CSVGTATTGGA 



31 
1 

ACAATEGATa 
ACTOGGAACr 
TTTAQCATTQ 
CAAGTGGATC 
GAAGATTCTB 
CAGGAXGTAG 
AACaOTACrA 



41 
I 

AATTQGOCTT 
TOGCTCTCAG 
GTCITGCAAG 
CRCTGGCATC 
TAT^XAGTTAO 
GACCCCA2^AG 
TCCAGAATCT 



51 
I 

CAAGATAGAC 
CGTATCATOC 
CAArAAlGAA 
TG'TAATTinS 



AAAAACXTTA 
GAAGGATGCT 



60 

120 

180 

240 

300 

360 

420 
GTTCAAATAA. AAATCAAAOV TACiUGAACT CAGCaAAGTGC ATCKTC3CCAT CTGTGCCTTC 4B0

IGGGATCTGA ACAAAAACAA AAGTTTTSGA GGATOGAACA GGTCAQGATG TGlTGCACAC 540 

A13AGATTCAe ATGCAASTGA QACAGTCTGC CTGTI3TAACC ACTTCACACA CTTTGGAGTT SOD 

CTGATGGACC TTCCAAGAAG TGCCTCACAG TTAGATGCAA GAAAOVCTAA AGTCCTCACT 660 

5 TTCRTCAGCT ATATTGQGTG TGOAATATCT GCTATTTITT CAGCAGCAAC TCTCXnXSACA 720 

TATGTTOCTT TTGAGAAATT GCGAAGG6AT TATCCCICCA AAATCTIGAT GAAOCXOAQC 7aO 

ACAGOCCTGC TGTTCCTGAA TCTCCTCTTC CTCCTA(^TG GCTOGATCAC CTCCTTCAAT 840 

GTGGATGC?AC TTTOCAlTGC TGTTQCAGTC CTGTTGCATT TCTTCCTTCT GGCAAOCTTT 90 0 

ACCTGGATQG GGCTAGAAGC AATTCACATG TACATiGCTC TAGTTAAAGT ATTTAACACT 9SD 

10 TACATTCX3CC GATACATTCT AAAATTCTGC ATCATTOGCT C3C3GGTTTGCC TGCXZTTAGTG X020 

GT6TCAQTTG TTCTAGCGAG CAlGTMACAAC AATSAAlSTCf ATQSAAAAGA AAGTTATG3S 1080 

AAA6AAAAAG CTTOATGAATT CrGTTGGATT CAAGATCCAG TCATATTTTA TGTGACCTGT 1140 

GCTGGGTATT TTGGAGTCAT GTTTTTTCTG AACATTGCCA TGTTCATTGT GGTAATOOTG 1200 

CAOATCTOTQ GQAGQAATGG CAAGi^GAAGC AACCGGACCC TGAOAGAAGA AGTGTTAAGG 12 CD 

15 AACCTGCGCA 6TGTG6TTAG CTTGACCTTT CIOTTGtSGCA TGACATGGGG TTTTQCATTC 1320 

TTTQccraaa gacccttaaa tatocxxittc atgtacctct TcrccstrCTr caattcatta 1380 

CAAGGCTTAT TTATA7TCAT CTTCXT^CTGT GCTATGAAGG AGAATGTTCA GAAACAGTOG 1440 

CDQCQGCATC TCTGCTGTGG TAGATTCCGG TTAGCAGATA ACTCAQa™? GAGTAAGACA 1500 

GCTAOCAATA TCATCAAGAA AAGTTCTQAT AATCTAGGAA AATCTTTGTC TTCAAGCTCC 1560 

20 ATTGC3TTCCA ACTCAACCTA TCTTACRTCC AAATCTAAAT OCAGCTCTAC CACCTAOTTTC 1620 

AAAAG6AATA GCGACACACA TAATGTCTCX: TAT6AQC2ATT CCTTCAACAA AAGTOOATCA 1£B0 

CTCAOACAOT GCTTCCATGG ACAAGTCCTT GTCAAAACTG GCCCATGCTG ATGGA6ATCA 1740 

AACATCAATC ATCOCTGTCC ATCAQOTCAT TQATAAGGTC AAGGGTTATT GCAATGCTCA 1800 

TTCAGACAAC TTCTATAAAA ATATTATCAT GTCAGACACC TTCauSCCACA GCACAAAGTT 18 GO 

25 TTAATOTCTT TAA0AAAAAO AAATCAATCT GCAGAAATGT GAAQATTTOC AAOCAGTGTA 1920 

AACTGCAAiCX AGTGATGXAA. ATGTGCTAXT ACCTAQOTAA CTGCSURATAT ATAAGGAATO 19B0 

TATTTTOTTA AGAAGGCTTT TGTGAAATTC AGAATTTTTC TTTTTAATAT ATTTCTrrCCA 2040 

ItSGAAGAGTT GTCATCACTA AAACTTC3M3T ACTOAQAGTA ACATGACTCR GTAGCCACAG 2100 

AAGCTATGAT TTOTAAAATA TATAATTGAA TCAGAGTAAT CATAATGCAS GGGAGACATT 2160 

30 CAAATXAS3AG ACAAGGSAGA AGCAATGCTO AQGAAGACOC TAGATAGAGC TCATTTTACT 2220 

CCACCTAATC 6TIATATCTG GATATACGCA TTTTCTGCAT CTTCTTTCTC AACAATAAAC 2280 

TGTGCTT6CT TTGGAGACTT TAAGACSTTT CCTAAAGCAC AAATAAAAGC CTCGTATTTC 2340 

CC5CATTQAQA QTTTTGTTCC AAOGAATATG AAGTGAGACA TATGGOTGAG TCATAATTiAT 2400 

CAAAATAATT TATGAAGAGC TGGGTCTQCA ATAQCTASTC TAAAAACTAC TTGTGTGTCA 2460 

35 QTCCTCIGOT TATAGTATAr AAGAGCCTGA GGAGGTCTGG CAAGATAGAT GGTGIATTAT 2520 

TIATQGATCA GGCIGCTOCA TACAAACCIT GCATACIATT ATGCAGCTTA GCTAACTCTC 2580 

ABACTATTCT GAGTAATGCT XGCTTGCTAA TCaUVTGTATA GGAQACCACA 7TGTAATTGT 2640 

TCTTAGATGA TGGAGTCX:AT GC3«3TTTCTT AQAAATOGGT CTCAGTGCRT GCTGTQCTTT 2700 

TTCAJCATTTO CTCTGGGTTA TClGGGTkAGT ATCAGGTTCT GGGAGGCAAC AfiCATXAAGT 2760 

40 GATAAGAAAA GGM3ACATTC TGGCAAAGCC AATCTGCTTA AAG6CAAAGT OCAOAACXIITa 2820 

OAACCTAOaa GCCITTCTCrr CTGGACGAAA AACAGGIAGT TTGOUBTCTO AGASIATGGGA 2880 

GAGCITTTAG GCXACACA6C AACOCAAOGQ AGCTCXCACSC TTTTGCTGAG CITCAATCAa 2940 

OAAGCTATTT GaCXGGCEGC AGCAGATGAT GAGAXAATGA GGTAQTGGGT TLTCTimNZ 3000 

TGTTCCATTT TOCAAGATOC TGCAACACCA TCCXGGGAGA CAAGAGCATT AOOCAGCTTG 3060 

45 GCTTTCACQG GOGSUaGSTTO TATTCAiGT 3088 



Seq ID NO: C166 DNA Sequence
Nucleic Acid Accession #: NM_000574.1
Coding sequence: 66..1211



1 
I 

OOGCTGGGCG 
GOGCCATGAC 
CCGGGC3X3C7 



Cr&X2VAS*AAC 
TGATCTGCCT 
AGGTGCCAAC 
TTCC»GTGG8 
TATCAOCAAA 
AAAAGAAATC 
GCATATTATT 
OGACTTCTAG 
AOIGCAQAjQA 
AAGGTGAOCA 
TGATTGGAGA 
CACCACCTQA 
CTAOCACAGT 
AAACCACCAC 
ATTTTCATGA 
TATCIGGGCA 
TGCTGACTTCA 
AGTTTCTTAO 
ATQCTTICAT 
GTCCTGGAAT 
TGAGAGTGAT 
GGGATCAOSA 
CCACTTATAA 
CCCAATTCAG 



11 
1 

TAiSCTGOGAC 
OGTOE3C3GCX3G 
GCIGCTGGTG 
ACCTA2VXGGC 
6TACAAATGI 
TAAGGGCAOT 
AAGGCTAAAT 
TACTGrTOTQ 
ACTAACTT<3C 
ATGCGC3!AA3 
TGQTGCAACC 
TTTTTGTCTT 
AATTTATTGT 



21 
I 

*rCGGCGGAGT 
COGAGCOTOC 
CTGTTSTGCC 



31 



43. 



51 
I 



GAAGAAAGCT 
GAATQGTCAQ 
TCTGCAXCOC 
QAATATGA^ 
C^nCnGAATT 
OQGGGAiGAAA 
ATCTCCTTCr 
ATTTCA6GCA 

ccAGCACcsu:: 



GCACTCTATT 
ATQCAGAGGA 
AAAtOTTCCA 
AOCAAAXGCr 

aacaaccgca 
cacctgtttc 
qccsuu^aag 
acttatcxgc 
igtctttaag 

CACATTCTTA 
TCCTTTOCTA 
GGAAAAGAGA 
AOGAAATAAA 
TCTCTTCTAA 



ACATQCTGGT 
AATCTTCCTT 
CTTAATQTCT 



^AAAAGTAT 



OCQGGOGGOG CGTCCnGTT (SAACOOGGC 60 

CCGOGGOGCr GCCCCTCCTC GGGQAGCTOC 120 

TGOOGGOCXSr GTaaQOTGAG TGT6GCCTTC 180 

TGGAAGG0G8 TACAAGTTTT CCaaAGOnXA 240 

TTGIGAAAAT TCCTGGOaAG AAfiGACTCAlG 300 

ATATTGAAGA GTTCTGCAAT OGTASCXGCG 360 

TCAAAC3U3CC TTATATCACT CAGAATTATT 420 

GGOGTOCM9S TTACAOAAGA OAAOCTICTC 480 

TAAAATGOTC CACASdUSTC GAATTTTGTA 540 

TACGAAATGG TCAGATTGAT Gl^ACCAGGTG 600 

CATaTAA£2AC AGQGTACAAA TTATTrGGCT 660 

GCTCTGTCCA GTQGAQTGAC CCGTTGCCAG 720 

CACAAAXT6A CAAT6GAATA ATTCAAGGOS 780 

TAAOGTATGC ATQTAATAAA GGATTCAOCA 840 

TATTGTACTG TGAATAATGA rEGAAGGStfaAO TGGAfiTGGCC 900 

AAATCTCTAA CTTCCAAGGT OCCACCAACA GTTCAiC3AAAC 960 

ACTACAGAAG TCTGAGCAAC TTCTCAGAAA ACCACCACAA 1020 

CAAfiCAACAC GGAGTACACC TOTTTCiCAaa ACAAGCAAGC 1080 

AATAAACXSAA 6raGAAIX3lC TfCAOGTACT ACCXX3TCXTC 1140 

ACGTTGACAO GTTTQ C IT G G GAOGCTAGTA ACCATGG6CT 1200 

AOTTA7\GAAG AAAATACACA CAAGTATACA GACTGTTCCT 1260 

AT&TTGGRTA AAATAAATGC AATTGTGCTC TTCATTTAGG 1320 

AISTGTTAGG AATGlCAACA GAGCAAGGAG AAAAAAGGCA 1380 

GC3^CACC1AC ACCTCTTGAA AATAGAACAA CTT6CAGAAT 1440 

AAAGTGTAAG AAAGCSVTAGA GATTTGXTGG lATTTAOAAT 1500 

AGGAAAGTGA TFTTTTTCCA CAAGATCTGT AATOTTATTT 1560 

AAATGAAAAA CATTATTTGG ATATCAAAAG CAAATAAAAA 1620 

GCAAAA1T6C TAAASAlOAGA TGAACCACAT TATAAAGTAA 1680 

TCATCTTTOC TXCGGGTTOG CAAAATATTT TAAAGGTAAA 1740 

ZGTTGATGGT QATAAQGGAG GAATATAGAA TGAAAGACTG IB 00 

ATAGAGTTTG GAAAAASQCX GrGAAAGOTG TCXTCTTTGA 1660 

CCAGAGATAC ^EACAATATTA ACATAAGAAA AOATXATATA 1920 
TTATTTCTGA ATCGAGATGT CCATAGTCaa ATTTGTAAAT CTTATTCTTT TGTAATATTT
ATTTATATTT ATTTATQACA GTGAACATTC TGATTTTACA TGTAAAACAA GAAAAGTTOA 
AOAAGATATO TQAAGAAAAA TGTATTTTTC CTAAATAOAA ATAAATGATTC OCATTTTTTG 
ST 

Seq ID NO: C167 DNA Sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2651
1
I 

ATQQACACCr 
GGGGGCAGCT 
GAGCaca7«33 
CCTTOCAACC 

aCTCTGACAT 
CTGCAGAATA 
CAATCCCTGC 
CTGCATTCX?C 
GCTTTTAGAA 
ATACCASACr 
AGAATOCRCT 
TTAAATTACA 
GAACTACATT 
CCTSAACXAA 
ACTQQAACTG 
CAAACOGTCT 
GAAGATTTAC 
OAAATCTACG 
TTGGCTTGQa 
ATAAACCTGG 
TTAACTCACT 
TTTCCAGAAC 
TGXGAOhATa 
GAOCTTCATA 
TTOCTQCTTO 
CCAGGCOCCT 
TGSACCATAQ 
TCCOCTCTGT 
CTCACQGOAG 
GCACX^CATG 
ATXTTTtJCTT 
TCTOTGAAAT 
A^TTBCTCT 
AAGTATGGCS 
rACATGGTCXa 
ACCAAGCTCT 
GTAAAACACA 



CCTCACTTXA 
AAACAOCCAA 
ACTCAAGCCT 
GTGCCATCAC 
CCATCSTCTTA 



11 
I 

CCCGGCTGGG 
CXCOCAGGTC 
GCAGGATGTT 
TCAGCGTCTT 
ATC00CTGC3C 
ACATTCCCAA 
ATCAGCTAAG 
GTCTGGATGC 
T6AGGCACCT 
GTTXATCQGC 
ATGCCTTTGG 
CCCXGGGAAA 
ATAACCTTGA 
TCTATGACSiA 
GAACACTQAC 
CAAACCTGGA 
GCAATCAC3TT 
CCAGTTTTTC 
AAASTTAAAGT 
ACRAAATTGC 
ACCTAXOGTC 
TAAAATTAAC 
TCAAGGTTAT 
CCTATAAGAT 
AGAAAGATGC 
ACmOAGGA 
TCRAACCCTO 
CAGTTCTGGC 
ACATXTCCCX: 
TCTtZCAOTGC 

CaOAATCATC 
ATOXTTGCRAA 
6TG00CTGCT 
CCTDCCCTCT 
CTCTCATCTT 
ACTGCAATTT 
TTGCCCTGTT 
CCTCITTAAT 
TAGTOOCACT 
AGGAGGATCr 
GCTTGATGTC 
TGGTAACCTT 
CACCTTATCC 
A 



21 
I 

TGIGCTCCT6 
TGQTGTGTTG 
GCTCAQGGTG 
CACCTCCTAC 
CAGTCTCOGC 
OGOAGCATTC 
ACAOGTACCC 
TAACCACATC 
GXGGCTGGAT 
ATTQCAAGCC 
AAAOCTCICC 
□AAATGCTTT 
TGAATTCCCC 
TCCCATCCRA 
TCTGAATOGT 
GAGTCIGACT 
ACCTAATCTC 
AGTCTGCCAA 
TOACRCTTTC 
TATTATTCAC 
CAACCTCCTG 
AGGAAATCAT 
AGAAATOCCT 
TTCTAATCAA 
TGQAATQTTT 
AGACCTGAAA 
TC3AACA.CCTG 
ACTTACTTGT 
CATTTAAACIG 
CGTGCTGGCT 



ATTTGAAAOG 
GQCCTTGACC 
CTGCCTGCCT 
GCTCAATTCX: 
GCACAM906A 
GCICTTCRCC 
AAACCTTACA 
TCCTGCATGT 
GQTQAOCCTO 
AATTATfcCTCr 
TACcaacTTCx: 
AGTGACTGAG 



31 

! 

TCCTTGOCTB 
CTGAGGGGCT 
GACTGCTCCG 
CTAGACCTCA 
tCTCCTGSAOS 
ACTGGCCTTT 
ACAGAAGCTC 
AGCTATGTGC 
GACAATGCGT 
ATGACCTTGG 
AGCTTGGTAa 
GATGGGCTOC 
ACTGCAATTA 

GGCTCACRAA 
TTAACTGGAO 
CAAGTGCTAG 
AAGCTTCAQA 
CAQCAGTTGC 
COCaWXGCAT 



GCCTTACAGA 
TATGCTTAOC 
T6GAATAAAG 
CAOGCTCAAS 
GCCCTTdATT 
CTTGATGGCT 
AATGCTTTOa 
TTAATTQGGG 
GGTGTQGATB 

CrmCTCTGG 
ARAGCTCCAT 
ATGGCXX3CAG 
TTGCCTXXXG 

GAGCTGGAGA 
AACTGCATCC 
TTTATCAGTC 
GXC3ATCCC3C 
AGAAAGGAAA 
GATGATGTCB 
AGCATCACTT 
AGCaySOCATC 



41 
I 

TQCTTQCTGCA 
GCCCCACACA 
ACCTGqGQCT 
GTATGAACAA 
A(3TTAQQTCr 
ACAGTCTTAA 
TGCAQAATTT 
OCCCaAGCEG 
TAACAGAAAT 
GCCTGAA£*AA 
TTCTACATCT 
ACAGCCTAGA 
GGACiUZTCTC 
GATCTGCTTT 
TAACTGAATT 
CACAGATCTC 
ATCTGTCITA 
AAATTCJRCCr 
TTAGGCTCGG 
TTTCCACTTT 
CTATAACTG6 
GCTTC3ATATC 
AGTGCTGTGC 
GTGACAACAG 
ATGAAGGIGA 
OUSTQCAGTO 
6SCTGATCA6 
TQACTTCAAC 
TCATCGCAGC 
CGTTCACTTT 
AT6ICATTG6 
CAGCOCTGGA 
TTTCIAGCCT 
TTCOCCTGCT 
GOGAGCXrCAG 
TCRTGATGAC 
ATATTTGGGA 
TAAACTGCCC 
CXGAAGTAAT 
TTCXCTACAT 
OCXAGGXCnG 
AAAAAC3VGTC 
ATGACCTGCC 
l^TCCTCTGT 



51 
I 

GCXGGCGACC 
CTGTCAXXGC 
CTCGGAGCTG 
CAICAGICAG 
TGCGGGAAAC 
ASTXCrrAXG 
GOGAAGCCTT 
TTTCASTGGC 
CCCCGTCCA6 
AATACSkCCRC 
CCATAACAAT 
GACTTTAGAT 
CAACCTTAAA 
TCAACATTTA 
TCCTQATTTA 
ATCTCTTCCT 
CAACCTATTA 
ARGACATAAT 
ATCSCTGAAT 
GCC^TOOCXA 
GTTRCATGGT 
ATCTGAAAAC 
AT^GGAGTG 
CAGTA TGGAC 
GClTGAAlGAT 
TTCACCTTCC 

AGTTITCAGA 
AGXOAACAia 
TGGCAGCTTT 
TTTTTTOTCC 
GOGTOGGTTC 
GAAAQTAATC 
GOGIGGCAGC: 
CACCATGGGC 
CATTGCCTAC 
CTGCTCTATQ 
T6TGC5CTTTC 
TAA{31TTATC 
CTTGTTCftAT 
GACMUSATCA 
CTGTGACTCA 
TC(3C!RS!PXCC 
OGCATTTGTC 



Seq ID NO: C168 DNA Sequence

Nucleic Acid Accession #: NM_009667.2

Coding sequence: 49..2772



1 
I 

TGCTGCTCTC 
CGOCTOGGTG 
CCCftGGTCTG 
AGGATQTTGC 
AGOGTCTTCA 
CCCCIGCCCA 
ATTOCCAASB 
C3VGCTAAGAC 
CIGGATGCTA 
AGGCROCTGT 
TTATC3GGCAT 
6CCTTTGGRA 
CTGGGAAAGA 
AACCTTGATG 
CATAGCAACA 
ACAAThCATT 
CCTGAACTAA 
ACTGQAACIG 
CAAACC3GTCT 
GAAGATTXAC 
GAAATCTAOG 




OTGrroTTQcr 

TCAGGGTGGA 
CCTCCTACCT 
GTCTOOGCT? 
GAGCATTCAC 
AGGTACCCAC 
ACCACATCRG 
GGCTGGATGA 
TGCAAGCCAT 
AOCTCTCCAG 
2VATaCTTTC3A 
AATTCCCCAC 
ATATCRGOTC 
TCXATGACAA 
GAACACTGAC 
CAAACCTQGA 
GCAATCAGTT 
CCAGTTTrrC 
.AAftXllAAAGT 



21 

! 

OGGCTOGTGG 
CTTGCCTGTG 
QAGGQGCTOC 
CTGCICOGAjC 
AQACtrxVAGT 
CCTGGAGGAG 
T6G0CTTTAC 
AGAAGCTCTO 
CTATGTGCCC 
CAATOGGTTA 
GACCTTOGCC 
CrrrGQTAGTT 
TGGGCTCCAC 
TGCAATTAGG 
QATACCIGAG 
TOCCATCXZAA 
TCTGAATGGX 
QAGTCTGACT 
ACCTAATCTC 
AGTCTGCCAA 
TGACACTTTC 



31 
I 

CCGCCTACTT 
CTGCTGCAGC 
CCCACAjCACT 
CTGGGGCTCT 
ATGAACAACS^ 
TTAOGTCTTG 
AGTCTTAAAG 
CAGAATTXGC 
CCAAGCTQTT 
ACAGAAATCC 
CTGAACAAAA 
CTACATCTCC 
AGCCTAGAGA 
ACACTCTCCA 
AAAGCATTTG 
TTTQTTQQGA 
QCCTCACAAA 
TTAACTGGAG 
CAAGTGCTAG 
AAGCTTCAGA 
CA6CAGTTGC 



41 
I 

OGOGCACCAT 
TG6CGACGGG 
6TCATTGCGA 
OGGAGCTGGC 
TCAGTCAGCT 
C6GGAAAC6C 
TTCXTATGCT 
GAAGOCTTCA 
TCAGTGGOCT 
CCGTCCAGGC 
TACACCACAT 
AXAACAATAG 
CTTTAGAT^T 
ACCXTAAAGA 
TAGGCAACCC 
GATCTGCTTT 
TAACTGAATT 
CACASATCTC 
ATCrOTCTXA 
AAATTQACCT 
TTAGOCTCCG 



51 
I 

GGACACCTCC 
GGGCAGCTCF 
GCCOQAOGQC 
TTCCAACCTC 
GCTCOOGAAT 
TCTGACATAC 
GCAGAATAAT 
ATOCCTSOGT 
GCATTCCCTG 
TTTTAGAAGT 
ACCAGACTAT 
AATOCACTCC 
AAATTACAAT 
ACTAGOATTT 
TTCTCTTATT 
TCAACATTTA 
TGCTGATTTA 
ATCTCTTOCT 
CAACCTATTA 
AAGACATAAT 
ATGGCTGAAT 



SO 

120 

IBO 

240 

300 

420 

4&0 

S40 

600 

660 

720 

780 

a40 

900 

960 

1020 

1060 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

LBOO 

1860 

1920 

19B0 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

25B0 

2640 

2651 



60 
180

240 

300 

360 

420 

460 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

lOBO 

1140 

1300 

1260 
TTGGCTTGGA
ATAAAGCTGG 
TTAACTCACT 
TTTCCAGAAC 
TGTQAOAJ^TO 
GIICCITCATA 



CCAGGCCCCT 
TOQACCATAG 
TOCCCTCXGr 
CTCACOG6AG 
GCACGACATG 
ATTTTTGCTT 
TCTGTGAAAT 
ATT^TGCTCT 
AAGTA.TGGGG 

ACCAAGCTCT 
OTAAAACACA. 
TTGrCCTTCr 
CTTCXGOTGG 
CCnCACTTTA 
AAACACOCftA 
ACTCAAOOCT 
QTQCCATCAC 
OCArGTCTCT 
BAQATTQAST 



ACAAAATTGC 
ACCTATCGTC 
TAAAATTAAC 
TCAAGGTTAT 
CCTATAAGAT 
AGAAAGATGC 
ACTTTfiAOSA 
TCAAACCCTG 
CAGTTCTGGC 
ACATTTCCCC 
TCTCCAGTGC 
GTGCCTGGTG 
CAGAATCATC 
ACTCTGCAAA 
GTGCCCTGCT 
CCTCCCCTCT 
CTCTCATCTT 
ACTGCAATTT 
TTGCCCTQTT 
CCTCTTTAAT 
TAGTCCCACT 
AGGftGGATCT 
GCTTGATGTC 
TGGTAACCTT 
CZAOCTTATCC 
AATTAATATG 
ATATCAGJUK! 



TATTATTCAC 
C?ACCTCCT6 
AQGAAATCAT 
AGAAATGCCT 
TTCTAATCAA 
TGGAAT6TXT 
ABACCTGAAA 
TGAACACCTG 
ACTTACTTGT 
CATTAAACTG 
CGTGCTGGCT 
OGAGAAXGOG 
TGTTTrCCTG 
ATTTGAAACG 
GGCCTTGACC 
CTGCCTGCCT 
GCrCAATTCC 
66ACAAGOGA 
GCTCITCACC 
AZVACCTTACA 
TCCTGCATGT 
GGTGAGCCXG 
AATTAACTCT 
TACCAGCTCX: 
AQTBACTQAG 
TGAAGGAAAA 
ACSTAATTAAT 



CXXZAATGCAT 

GCCTTACAGA 
TATGCTTACC 
TGGAATAAAG 
CA0GCTCAAG 
GCCCTTCATT 
CTTGATGGCT 
AATQCTTTGQ 
TTAATTGGGG 
GGTGTGGATG 
GTTGGTTGCC 
CTTACTCTGG 
AAAGCTCCAT 
ATGGCCQCAG 
TTGCCTTTTG 
CITTGCTTCC 
GAGCTGG&GA 
AACTQCATCC 
TTTATCAGTC 
CTCAATCCCC 
AGAAAGCAAA 
GATGATQTCG 
AGCATCACTT 
AGCTQCCATC 
TGTTTTCAAA 
AAOAAGAGCT 



TTTCCACTTT 
CTATAACTGG 
GCTTGATATC 
AGTGCTGTGC 
GTGAtrAACAG 
ATGAACGTGA 
CAGTGCAGTG 
GGCTGATCAG 
TGACTTCAAC 
TCATOGCRGC 
CGTTCACTTT 
ATGTCATTGG 
CAGCCCTOQA 
TTTCTAGCCT 
TTCCCCiaCT 
GGGAGCOCAG 
TCATGATQAC 
ATATTTGGGA 
TAAACrTGCCC 
CTGAAGXAAT 
TTCTCTACAT 
CCTACGTCTG 
AAAAACAOTC 
ATGACCTGCC 
TTTOCTCTGT 
G^i-i-VaAGAAC 
GAGGTGAAAC 



GCCATCOCTA 
GTTACATGGT 
ATCTGAAAAC 
ATTTGGAGTG 
CAGTATGGAC 
CCTTGAAQAT 
TTCACCTTCC 
AATTQGAGTG 

AGTrrrCAGA 

AGTGAACRTG 

TQGCAGCrrr 

TTTTTTGTCC 
GCOTQQOTTC 
GAAAGTAATC 
GQOTGGCAGC 
CACCATGGGC 
CATTOCCTAC 
CTGCTCTATG 
TOTOGCTTTC 
TAAGTTTATC 
CTTGTTCAAT 
GACAAGATCA 
CTGTGACTCA 
TCCCAGTTCC 
GGCATTTGtC 
CTGAAAATOT 
T06GTTTAAA 



1320 
13B0 
a44D 
1500 
1S60 
1620 
1680 
1740 

laoo 
i&eo 

1920 
1980 
2040 
2100 
2160 
2220 
2260 
2340 
2400 
2460 
2520 
25B0 
2640 
2-700 
2760 
2820 
2880 



Seq ID NO: C169 DNA Sequence 

Nucleic Acid Accession #: NM_003506.1 

Coding sequence: 259..2379 

1 11 21 31 41 51 

| | | | | | 

GCAGCTCCAG TCCCGGAOGC AACCCGGGAG CCGZCXCAOG TCS2CTGGGGG GAACGGTGGG 60 

TTAQAOGGGO AC3GGGAAGGG ACAGCGGCCT TOGACC5GCCC CCX33AGTAAT TQACKXAOQA 120 

CTCAT^TTTCA GGAAAGCCTG AAAATGAGTA AAATAGTGAA ATEGAGGAATT TGAACATTTT 160 

ATCTTTGGAT GGGGATCTTC TGAGGATGCA AAGAGTGATT CTTCCAAGCX: ATGTQGTAAA 240 

ATCAGGAATT TGAAGAAAAT GGAGATGTTT ACATTTrrGT TGACX31GTAT TTTTCTROCC 300 

CTCCTAACAG 6GCACRGTCT CTTCACCTGT GAACCAATTA CTGTTOCCAG ATGTATOAAA 360 

ATGGCCTACA ACATGACGIT TITCCCTAAT CTGATGGGTC ATIATGACCA GAGTATT6GC 420 

GOGGTGGAAA TGGAGCATTT TCTXCCTCTC 6CAAATCTGG AATGTTCAOC AAACATTGAA 480 

ACTTTCCTCT GCAAAGCATT T6TACCAACC TOCATAGftAC AAATTCATGT GGTTCCAOCT 540 

TGTOGTAAAC TTTGTGAGAA AGTATATTCT GATTGCAAAA AATTAATTGA CACTTTTGQG 600 

ATCCGATOOC CIGAfSGAOCT TrGAATGTGAC AQATTACAAT ACTQIGA3GA GAiCTGTTCCT £60 

GTAACTTTTS ATGCACACAC AGAATTTCXT GGTCCTCSUGA AGAAAACSUSA ACAW3TCCAA 720 

AfiAGAjCATTG GATTTTGOTa TCX3U.aGCAT CTTAaOACTT CTOOGaC&UA. AfiOA^CAtCAAG 7B0 

TTTCTGGGAA TTGACCAGTG TGC3GCCTCCA TGCOCCAACA TGTATTTTAA AAOTGATGAG 840 

CTAGftGTTTG CATiAAAQTTT TATTOaRACA OTTTCAATAT TTTQTCTTTS TGCAACTCTG 900 

TTCRCATTCSC TTACTTTTTT AATTGATGTT AGAAGATTCA GATACCCAGA GAGAOCAATT 960 

ATATATTACT CTGTCTOTTA CAGt^ATTGTA TCTCTTATGT ACTTCA1!TGG ATTTTTGCIS 1020 

OGCGATAGCA CAGCCTGCAA TAAGGCAGAT GAGAAGCTAG AACXTGGTGA CACX3TTSTC 1060 

CTAGGCTCTC AAAATAAfiGC TTGCACOGTT rrGTTCATGC TTTTCTATTT TTTCACAATG 1140 

GCTGGCACTG TOTGOTBOGT QATTCTTACC A7TACTTGGT TC:TTAGCXGC AGQAAGAAAA 1200 

TG6A6TIGTG AAGCC ATCG A GCAAAAAJQGA GTGTGGTTTC A3GCIGITGC AaX3GGGAACA 1260 

CCAGGTTTCC IOIACTGTTAT GCTTCTraCT GrTQAACAAAG TTaAAOfiACSA CMkCnStCMSSf 1320 

GGAGTTTGCr TTA'XGAGCIG GATGCTTCTC GCl^CTTTGT ACTCTTGOCA 1380 

CTGTGCCTTT GTQTGTTTGT TGGGCTCTCT CTTCTTTTAG CTGGCATTAT TTCCTTAAAT 1440 

CATGTTOGAC AAGTCATACA ACATGATGGC OGGAACCAAG AAAAACTAAA GAAATTTATG 1500 

ATTOGAATTG GAGTCTTCAG OoaClTaTAT CTTOTGOCAT TAOTGACACT l^CrCGGAIGT 1560 

TAOGTCXATG AGCAAOTGAA CAGGATTACC TGGGAlQATAA CTTGQ G TCTC TQATCATTGT 1620 

OSTCAGTAOC ATAICCCAT6 TCCTTATCAG GCaUUU^GCAA AAQCrOSACX! AGAATTGGCT 1680 

TTATTTATGA TAAAATACCT GATGACRTTA ATTGTTGGCA TCTCTGCTGT CTTCTGGQTT 1740 

GGAACS^AAAA AGACATGCAC AGAAIGGGCT GGGTTTTITA AAOGAAATOG CAAGAGAGAT lOOO 

GCAATTCAQIG AAAGTCGAAG AGTACTACAG GAATCATGTG AGTTTTTCTT AAAGCACAAT 1860 

TCTAAAGTTA AACACAAAAA GAAGCACTAT AAftCCSUUSTT CACACAAGCT GAAG6TCATT 1920 

TCCMATCCA rcGGGAACXZAG CACAGQAGCT ACAGChAATC ATGGCACTTC rCGCAGTAfiCA 1980 

ATTACTAGOC ATGATTACCT AGGACAAGAA ACTTTGACAG AAATCCAAAC CICACCAGAA 2040 

ACA!rC:AATGA GASAGGTGAA AaOGGAaSQA GCTAGCACCC OCAOGTTAAG AGAiVCASGAC 2100 

TGTGGTGAAG CIGCCTCGCC AGCAQCATCC ATCICCAGAC TCXCXGGGGA ACASGT06AC 2160 

GOQAAIGGOCC AGGCAGGCAO TGTATCTGAA ASTGOGOSn GTGAAGGAAG OAT13UXCGCA 2220 

AAGACTGATA TTACTGACAC TGGCClGOCA CAQAGCAACA ATTTGCAGGT CCOCAGTTCr 2200 

TCAOAACCAA GCAGCCTC7VA AGGTTCCACA TCTCTGCTTC TTCACCCAGT TTCAOSAGTC 2340 

AGAAAAGAGC AGGGAQGrOQ TTGTCATTCA OATACTTGAA GAACATTrTC TCTGGTTACT 2400 

CAaAAf3C3U^A TXTGTGTTAC ACTGGAAGTG AGCTATGCAC TGTTTTGTAA GAATCACTGT 2460 

TAGQTTCTTC TTTXGCACtT AAAGT3GCAT TGCCIT^CTGT TATACIGGAA AAAATAGAGT 2520 

TCAAGAATAA TATGACTCAT TTCACSU^AAA GGTTAATGAC AACAATATAC CTQAAAACAG 25B0 

AAATGTGCAQ GTTAATAATA TTTTTTTAAT AGTGTGGGAG GACAGAGTTA GAGGAATClT 2640 

CCTTTjCCTAT TTATOAAGAT TCTACTCTTG GTAAQAGTAT TTTAAGATOT ACTATGCTAT 2700 

TTTACCTTTT TGATATAAAA TCAAGATATIT TCXTTGCTGA AGTATTTAAA TCTTATOCTT 2760 

GTATCITTTT ATACNTATTT GAAAATAAOC TTA<IATGTAT TTGAACTTTT ITCGAAATCCT 2820 

ATTCAAGTAT TTTTATC3VIG CXATTGTGAT ATTTTAGCAC TTTGGTAQCT TTTACACIGA 2860 

ATTTCTAAGA AAATTGTAAA ATAGTCITCT TTTAXACTGT AAAAAA2U3AT ATAOCSlAAAA 2940 

GTCTTATAAT AGGAATTTAA CTTTAAAAAC CCACTTATTQ ATAOCTTACC ATCZAAAATQ 3000 
TSTGATTTTT ATMSXCIGGT TTTAGGAATT TCAOtfaATCT l^AATTATBTA ACTGAAATAA 
SGTGCTTACT CAAAGAGTGT OCACTATTOA. TTGTATTATQ CTGCTCACTG ATCCTTCTGC 
ATATTTAi!^ TAAAATGTCC TAAAGGGTTA OTAjGACftAAA TGTTAGTCTT TTGTATATTA 
G6CCAAQTGC AATTGACTTC CCTTTTTTAA TGTTTCATGA CCACCCATTG ATTGTATTAT 
AACCACTTAC AGTTGCTTAT ATTTTTTGTT TTAACXTTTG TTTCrTAACA TTTAORATAT 
TACATTTTQT ATTATAOUST ACCTTTOTCA GACATTTTQT AO 

Seq ID NO: C170 DNA Sequence 
Nucleic Acid Accession #: NM_000582 
Coding sequence: 88-990 

1 11 21 31 41 51 

| | | | | | 

GCAGAGCACA GCATCGT03G GACCAOACTC arCTCAGGCC AGTTGCftGCG TTCTCAGCCA 
AACQGC6ACC AAGOAAAACT OWaSiCCATG AGAATTGCAi3 TQATTTGCTT TTGCCTCCTA 
QBCKSCMCCT 6TGCCATACC AGTTTkAACAG GCTGAXTCIG GAAGTTCT^ QQAAA2)fiCAG 
CTTTACAACfc AATACOCAGA TGCT6TGGCC ACATCGCTAA ACCXTTGACOC ATCTCAGAAa 
CAGAATCTCC TAGC5CCX3«». GACCCTTCCA AGTAAGTCCA ACGAAAGCXA TOAOCACATG 
GATOATATOQ ATGATGAAGA TGATGATGAC CATGTGQACA GCCflfiGACTC CATTGACTCS 
AAOSACTCrCG ATGAT6TA6A TGACACTQAT QATTCTCACC AGICTGATGA GTCTCACCAT 
TCTCaVTOAAT CIQATQAACT G6TCACXGAT TTTCCCACQG ACCTQCXMC AACOSAAGTT 
TTCACTCCAG TTGTOCCCAC AGTAQAChCA TATOATGGCC GAGOTGATAQ TOTGOTTTAT 
GGACTGAtaOFT CRAAATCTAA GAAGTTTCGC AGAC5CTGACA TCCRGTACCC TGATGCTACA 
GAlOeAGGACA TCACCTCACA CATGOAAAGC aiiGGAGTIGA ATGGTGCATA CRflGGGCATC 
CXSCOTTOCCC flOGACCTGAA GGOSdCTTCr GATTGCJCaACA QQCUrGGGAA GgACROTTAT 
GAAAOGaUaTC AGCraaATOA GCIUaAGTGCr GAAAiCCCACA GCCACAAGCA GfTOaGtOTA 
TATA2U3CGGA AftGCCAATGA TGAGAGCAAT ISAGCKXTCCa Al?GrGATTGA TAGTCAGGAA 
CTTTCCWW? TCaQOOGTGA ATTCCaCMC CATGAATTTC ACAGCC3VTGA AGATATGCT6 
GTIGTAGACC CCAAAAGTAA GGAAGAAQAT AAAGACCTGA AATTTCGTAT TTCTCa*EG2iA 
TTAGATAGTO <a.TCTTCT6A GGTCAATTAA AAGGA6AAAA AATACAATXT CTCACTTTGC 
A'mrAGICAA AA6MAAAAT GCTTTATAOC AAAATGAAAG ASAACAT(2AA JOaOE^CTeX 
CTCAGTTTAT T6GTTGAATG TGTATCTATT TGAGTCTGOA AATAACTAAT GTGTTTQATA 
ATTAGTTTAG TETGTGGCTT CATGGAAACT CCCTQTAAAC TAAAAGCTTC AGOQTTATOT 
CTATGTTCAT TCTATAGAAG AAATGCAAAC TATCACTGTA TTTTARTATT TGTTATTCTC 
TCATGAATAG AAATTTATGT AGAAGCSUUVC AAAAIACTTT TACCCACTTA AAAAQAQAAT 
ATAACATTTT ATflTCRCTAT AATCITTTGT TTTTTARGTr AOTQTATAM TTOTTGTGAT 
TiOCrrsTEG TOGIGTGAAT AAATCTTTTA TCrXGAAIGr aataagaatt tggtggtgtc 
AATTGCTTAT TTGrTTTCOC ACG6TTGT0C ASC3UVTTAAT AAAACATAAC CTTTTTTRCr 
GCCOAAAAAA AAAAAAAMA AAAA 



3060 
Seq ID NO: C171 DNA Sequence 
Nucleic Acid Accession #: NM_002821 
Coding sequence: 150..3362 

1 11 21 31 41 51 

| | | | | | 

AACTCCOGCC TO5G6AC3GCC TCaOGSrCQG GCTCOGGCTG OGGCTGCTGC TGCGGOGCOC 60 

GGQCICC3GQT GOSXCOSCCT GCTGT6000G OOSroQAGCA GTCTGOaGOC OGCCQTSOQC 120 

OCrCAGCTOC TTTTCCTGAO CCXS^COjCSeA TGGGA6CTGC G06GGGATCC COSGCCftGAC 180 

CCaSCOQGTT GCCICTGCTC AGOGTCCTGC TGCTGCCJOCT GCTGOQCGGT ACCC7«3ACfta 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGOOGCGC; CX3GGGGCTGC 300 

TTCGCTGTGA GGTTGAGGCT GCGGGCCXX30 TACKTOraiA CTGGCXSCTC GATGQOGCCC 360 

CTGTOcaGGA cAasaftsscxao asmcseocc agggcagcrg ccigaocttt gcagctgtgg 420 

ACCGGCtGCA GGACTCTOOC AOCTTOCAQT QTOTGGCTGG OGATGATGTC ACTGQAQAAQ 480 

AAOCCOTCAQ TGCCAACGCC TCCTTCAACA TCA3\ATGGAT TOAOOCAGGT CCTGTQGTCC 540 

TGAAGCAIGC AGGCVOGOAA GCTGAGATCC AC30CAC2U3AC CCAG6ICACSV CTTOQTTGCC 600 

MOLTTOKTOa GCACCCTOSG GCC2ACCTACC AATGSTTCCX3 AGWrGGGACC GCGCTTTCTG 660 

ATGQTCABAG C2UUX3^CACA GTCAGCAOCA Af3GAGC3GGAA CCIGAGGCTC OGGCCAGCre 720 

tyrcCTGaaCA TnfiTGGGCTG TATTOCTGCT G0GCX3CACM3 TGCTTTTGGC CAGGCTTGCA 780 

GCAGOCAGAA CTTCAOCTTC3 AiSCATTOCTQ ATGj^AAGCTT TGOCAGQOTG OTGCTdGCAC 840 

cccaaaacGT qgtagtagqg aggtaxgagg aggccatott ccsmxsccAG ttctcagcoc soo 

AGCCACOOOC GAQCCTGCIU3 TGGCTCTTT6 AGGATGAGAC TGCCATCT^ AAlGGGCASTC 960 

GCCCXXXACSL CCTOOGCAGA GOCACAOTtTT TTOOCAAOGG GfCCCTGCTG CXGACCCnQQ 1020 

T0CX9GCCACB CAATQCauSGG ATCTAOGSCT GCATTGGGCA GGGGCnOAQG OGOCCACOCA 1080 

TCATCCTGGA T^CCACACTT CACCTAGCAG Ai3RTTGAAGA CATGCOGCZA TTTOJUaCCAC 1140 

GGC^QTTTAC AGCIQOC:;^ GAGGAGOGTG TGACCTQCCT TOCCClOChAG GGTCTGCCAG 12Q0 

-AGCGCAGGGT GTGGTGGGAB CAGOCGGSAQ TGGGGCIGCC GAOGCATOGC AGGGTCTACC 1260 

AIBAAQGQCCA C33AfiClGGT6 TTGGOCAATA TIGCTGAAAG TOATacrGGT GTCXACAOCT X32D 

GOCA06CGGC CAAOCrGGCT GGTCAGCGGA GAlCAGGATGT CAACATGACT GTQOCCACTG 1380 

TGOCCTCCTO GCTQAAGaAG CCXXaAGACA laCCflaCTGGA GOAQOCaCAaA OGCGGCIACT 1440 

TGOATIGOCX GACCCA6GOC ACAOCAAAAC CTAOUSTTQT CTGGIACABA AAGCAGATGC 15O0 

TCAICTChSIA QGACrCAOGG TTOSAGGIGT TCAAGMWIGG GACCTTOCGC ATCAACAGGG 1560 

rFGGASGTGTA TGATGGGACA TGGTACOQTT GTATGftCSCAG CAC!CCCAGCG GGCA8CATOS 1620 

AOGCGCAAGC COGTGTCCAA GXGCrtGGAAA AGCTCAAGTT CACAOCACCA OCCCftGCCAC 1680 

AGCAGTGCAT GGAGITTGAC AAGOAC3GCCA CGXSEGCCCTQ TTCAGOCACA GGCOGAGAGA 1740 

AGCCCACTAT TAAGTOGQAA CX9GGCASA1!G GGAGCAGCOr COCAGAQTQG GTOACAGACA 1800 

ACGCTGGGAC OCTGCATTTT QCCCGGGTGA CrCQAGATGA CaCTGGCIUU: TACACTTGCA IB 60 

TTGCCTCCAA 0SGGC05CAG GGCGAOATTC GTOCOCAIGT OCA18CTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAfSTGQAA QCAGAGOGTA OQACTOTOTA CC3U5GGCCAC ACAGOCXTTAC 19B0 

T6CAG:rGCX3A GGCXX^UGGGG GACCCCZ^C OGCTGATTCA GTOQAAAGGC AAGGACCSSCA 2040 

TCCTQOACGC CACCAAGCTG G6ACCCAGGA TGCACATCIT CCAGAATQGC TCXXISGGTOA 2100 

TGCATOAOST GQCCCCTOAG GAClCAGGOC GCTACAOCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CAOOGAGGCC COOCTCXATG TCGXrOGACAA GCCIGTGCCG OAGGIUXrGGG 2220 

ASGGCCCTGO CAGCCCTCCC CCCTACAAGA TGATGCKBAC CATItSGGTTG TGOGTGOGTG 2260 

GCGCIGTOGC CTACATCATT GCOGTGCTGG GCClCATOTT CTACTGCSkAQ AAGGGClGCA 2340 
AAGCCAAGCX3 
GAGGGCCTTT 
GCTTGQGCrC 
TCCCACGGTC 
TCCTGOCAAA 
GCCTGCUSAC 
GGAAGCTGAA 
ACrrACATGGT 
AGAGCAAGGA 
GCACCCAGGT 
TO6CIGCG0G 
TCAGCAAGGA 
GCTGGATGTC 
GCTTOaGTQT 
CAGATGA.76A 
GCEOOCCTTC 
G60CCTCCIT 
GAGQAQGOAQ 
CAGCATGATG 

GGCTGACTTG 



AGGGTTAATG 
AClVCAGCAftS 
OCCCACOCTT 
CTTTTGJyCAC 
TGCAaCGTGG 
GCCATCCTTA 
TGTTTTTQTT 



GCTGCAGAAG 
GCASAA0QG6 
OGGCCCGGCQ 
TAGCCTGCAG 
OGCrCAGGGC 
GAABOATQAG 
CCAOGCCAAC 
OCTGGAATAT 
TG3\AAAATTG 
AOCCCTGGGC 
TAACTGCCTG 
TGTGTA^AC 
CCCCGAGGCC 
eCTGATGTGG 
ASTACTGGCA 

CAGTG7VGATT 
(XCXJCTCAGQ 
GGCAAGATCC 
CTGAGCftGGG 
GAOCCAAACT 
ATCAGGGACA 
GA0C3GGGTCC 
TOAOcrrGGOfr 

AGTCTCTTGC 
TQAC3TCCTCC 
CTCTCCTTTC 
TATAXAAACC 
QQTGQOTGGO 
CCGCACACTT 
TTTACACrC» 



CAGCCOGAGG 
CAGCCCTCAS 
GCCACCAACA 
CCCATCACCA 
TTGGAGGAGG 



6TGGATCTG6 
AAGTCACAGC 
ATGGAGCACC 



GOGAGGAGOC 
CAGAOATCCA 
AACGCCACAG 
caCTGOGGAA 
GAGTGGCAGA 
TOaACTTGCG 
TCCIOGGSCr 
GAGACCTCAA 
CCCTCAGCAC 
TGTOCAACAA 



AGAGATGGAA 
AGAAGAA6TG 
CACAAGT6AT 
GAGTaAGTTT 
GACOCTGGTA 



AGTGAGTACT 
ATCCrOOAGG 
GAAGTGTTTA 
GATXKKIASS 
COGCTGATGC 
GOC3)O0GCCC 
ATGGOCTGGG 
CTGTCCTCCT 
CCIGGCCTTT 
GCaGGQftCTAlD 
GTGTGGGTGC 
AACTCTQCCA 
TTGTGOGGAG 
CCACT06TCC 
CCACTCTGCaO 
CTCATGCTAA 
GCECTTTTTa 
CATGGGAGGT 
Xl^A^rrGTTGT 
CIGCTC7CAA 



ACCACTTCOG 
GTGACTTCTC 
CACATGGAGA 
CTOQQAAGQC 
AGCGCTGCIG 
TGOOAGACAG 
CA6GG6AGGA 
GOGCCCTGAS 
CCTOCTCTTC 
GGCTTTOAOC 
CACAG6TAAC 
CTCATCIGCC 
TrOCTTAATA 
ACTTGGGOGT 
CTTOTOCACA 
GTGCCIGGCA 
TATOCSVCCAC 
A6GGGT0GGC 
C5GTTTTTTGT 
TAAATAAOCC 



GTGCXXaGGAG 
GCRGTTOCTQ 
CA^CAGAAQ 
COGCTTTGTG 
GAAGGTGTCI 
CCAGGCCTGG 
TACCAAGTCT 
GATGCCCCAT 
TAGACTTCCT 
G6C0CTCAGC 
CACCGTGGAC 
CATCTCTAGA 
GTGCCCTAGT 
CTCAGCCTCA 
TOOGCAGTrr 
GCCAATTTCT 
AACTTTGCCT 
TTCTCAAGTr 
CTAGACCAGG 
GTOACCCABA 
GATGAASGAS 
GGGCGGCTTT 
OCTGGAGATG 

TTTTTTA 



TGCCTCAACG 
OCCTTGACCA 
AAGATGCACT 
GOGGAGGTGT 
CTTGTGAftGA 
6AK3ATGITTG 
GCTGAGCOCSC 
AGOATTTCXZR 
GTGGCCCTAT 
CATAAOGACT 
GCCCIGGGCC 
6TGCCGCTGC 
GATGTCTGGG 
GGXGGGCAGG 
CAGCCC3GAGG 
CCCAAGGACC 
AGCAAGCC3GT 
GGGAAGCTCA 
GC AACAGG CA 
TCCTTTGGGA 
OCCC36CCAC 
GGCCTTCAAC 
GGGGAGGGCT 
CTGGGCAC&C 
ATTATAGAGG 
CCCACQTCTT 
TTTTCAGGAG 
TATATGTAAT 
AGGAGG6TGG 



Seq ID NO: C172 DNA Sequence 
Nucleic Acid Accession #: NM_002309.2 
Coding sequence: 65..673 



5X 



2400 
2460 
2520 
2SaD 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
33 OO 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
37QO 
3840 
3900 
396Q 
4020 
4080 
4140 
41B7 



1 11 21 31 41 

| | | | | | 

ATOAACCTCT GAAAACTGCC GGCATCTGAG GTTTCCTCCA AGGCCCTCTG AAGTGCAGCC €0 

CATAATGAAG GTCTTGGCGG CAGGAGTTGT GCCC3Cl?QCTG TTGGXICTGC ACTG(3A2UUCA 120 

TGOGOCGGGG AGCCCOCTOC CCATCACCOC TGZCAAOGCC ACCTGTGOCA lAOGOCACCC 180 

ATGTCAGAAC AACCTCAIGA ACCAGATCAG CaAOCCAACTO GCACAGCTCA AIGGCAGTGC 340 

CAATGGCCTC TTTATTCTCT ATTACACAGC GCAGGGG6A6 COGTTCCXCA ACAACCTGGA 300 

CAAGCTATGT GGGCCCAA£3G TGAiC36GACTT CCGQCCSCTTC CACCSCClAAlDa GCACQQAGAA 360 

GGCCAAGCIG GTGGAGCTGT ACX:GCATAGT GGTGTACXn^ OGCACCTCCC TGGGCaU^T 420 

CACCOGGGAC t^AGAAGATCT TCAACCCCftG TGCOCTOVGC CTCCACAGCA AGCIXMOGC 4 SO 

CACCGCCGAC ATCCTGCGAG G0CTCCXTA6 CAACBTQCIG XGOCGCCTGT GCa^CRAGTA 540 

GCACGTOQGC CATPGrGGACO TGACCTACGS GCCTOACACC TCGGGTAAQG ATGTCrrTCCA 600 

CSAAOAAGAAG CTGGGCrGTC AACTCCTGGG GAAGTATAAG CflGATCATOG OOGTGTTGGC 660 

CCAGGCCTTG TAGCAGQAGG ICXTGAAGOX? TGCTGXQAAC DQAGGCaATCr CAOOAOTTQa 720 

QTCCAGA70T GGGGGCCXCT CXAAGGGTGG CTGGOGCGCA GGGGATOGCT AAAOCCAAAT 780 

GGGGGCTGCT GGCAGAC!CCC GAGGSSGCCT GGCCSUSTCCA CTCCACTCIG GQC!TGGSCTG 840 

TGATOAAGCT QAGCAGAOTG GAAACTTOCA TAGGGAGGGA GCTAGAAGAA GGTGOCCCTT 900 

CCTCTGQGAO ATTGTGGACT QQQSAGaSTG GGCTGGACTT CTGCCTCTAC *rTGTCCCrrT 960 

GGCOGCTIGC ^TCACXTTBTQ CAGT3AACAA ACTACACAAS TCATCIACAA GAGCCCTGAC 1020 

CAawaOGXGA GAGAiGCAGGG OCCAGGGGAS TGGAOCftGCC OCCAGCAAAT XATGAOGftTC 1080 

TGTGCXHTTG CTGCSCCCTTA GGTTGGGACT TAGGTGGGGC AGAGGGGCTA GGATCOCAAA 1140 

GGACTCCTTG TCCGCTAGAA GTTTGATTGhG rrGGAAGAtTAG AGAGGGGCCT CIGGGATGGA 1200 

AGOCTGTCTl! CiTTTGACacaA TGATCAGAGA ACTTGGGCAT AGGAACAATC TGGCT^GAAGT 1260 

TTCCaOAAGQ AGOTCACTTG GCATTCAGGC TCTTGGGGAS GCAGAGAAGC CAiOCTTCAGG 1320 

GCIGGGAAG6 AAGACACTGG GAGGAGOAQA GGCCTGGAAA □CTTTOaTAS GTTCITCGTT laflO 

crerTCCcc3G tqatcttccx: tgcagoctog gatqgccagg gtctgatgcc tggaoctqca 1440 

GCAGGGGTTT GTGGAGGTGG GTAGQGCAGG GGCiAGGTTGC TAAGTCAGGT GCAGAieGTTC 1500 

TGAGGGRCCC AQGCTCTTCC TCrGGGTAAA GGTCTGTAAQ AAGGGGCTGG GGTAGCTCAG 1560 

AGT2U3CAGCT CACATCXGAG GCOCIGGGAG GTCIIGIGAG GTCAGAOUBA GGrACTTGAG 1620 

.GGGGACTGGA GGCOGTCTCT GQTCOCCAGG GC3VA0G0AAC AGCAGAACIT AGQGTC^GGG 1680 

TCTCAOGGAA COCTGAGCTC CAAGG6TGCT GTGGGTCTGA CCTGGCSVfGA •JTJCTA'lTJIA 1740 

TTATGATATC CTATTTATAT TAACrTATTG GTGCTTTCAO TGacCAAGTT AATTOOCCTT 1800 

TGCCIGGTCC CTACTCAACA AAATATGATG ATGGCTCCGG ACACAAGGGG CfiGGGCCAGG 1860 

GCTTAOCSUaG GCCTOGTCTG GAAGTOGACA ATOTTACAAG TGGAATAAGG TTAOS6GTGA 1920 

AGCTCAGAGA AG08TGGGAT CTGAGAGAAT GGGGAGGGCT GAGTaOGAGT QHSaOOaCCrr 1980 

GCtCCAGCXZC CATCC3CCTAC TGTGACTTBC TTTAGOGTGT CAGGGTOCAG 6CTGCA0GGG 2040 

CTGGGCGAAT TTGTGGAGAG GCOGOGTGCC TTTCIG7CIT GCTTCCAGGG GGCTGGTTCA 2100 

CACTGITCTT GGGCGCCCCA GCATTGTGTT GTQAGGOGCA CTGTrCCTGG CMSATATTGT 2160 

GCCCX3CTOGA GCAGTGGGCA AQACAOrCCT TGTGQCCCAC CCTQTCCTTa TTTCTGTQTC 2220 

COCATGCraC CTCTGAAATA GCGCCCTGGA ACAACCCTQC OCCTGCACOC AGCATGCTCC 2280 

GACACAGCAG GGAAGCTCCT GCTGTGOOCX: GGACACOCAT AQAOGGrGCG GGQOaCXrrGG 2340 

GTGGGCCAGA CXZCCAGGAAG GTGGGGTAGA ClOGGGGGAT GAGCTGCCCA TTGCTCCCAA 2400 

GAGGAOGAGA GGGAGGCTGC AGAOSOCIGB OACTCAGACC AGGAAGCTGIT OG6CCCTCCT 2460 

GCTCXSU^CXXr CATCCCAOTC CCAOCCA3X3T CTGGGCTOCC AGGC3«3GaAA CCCGATTCTCT 2520 

TCCTTTGrGC TGGOGCCAG6 aSAGTQGAGA AAOGCCCTOC AGTCTGAGAG CAGGGGAGOG 2580 

AAGGAGGCAG CflGAGTrOOQ GCAGCTGCTC AGAQCftlSTGT TCTGGCTTCT TCTCAAACCC 2640 

TGAGGGGGCT GCGGGCCTCC AAGTTCCTCC OACAAGA1GA TGGTACTAAT TATGGrACTT 270O 

TFCACTCACT XTGCACCTIT CCCTOTOGCr CICTAAGCAC TTTACCTGSA tGGCGaOTGG 2760 



GCAGTGTGCA 
CCATCTGTCC 
CCCTTGTCCA 
CTGGCTGC3VT 
GCTGATTOCA 
GGAGCTCTGT 
GGACTTCGCC 
CGTCCACCCT 
CCGGGAGCTG 
OGGAOGAAOA. 
TTTTTTTAAA 
Cn^TGCCTQ 
CTCTCCAGAG 
CCCltSCAGAT 
COGCATCCAG 

GAGsrrrccc 

TBMTGIAATGT 
CATATTTAAR 
GAGAAATGAA 



GGCAGGTCCT 
ATCCCAACAG 
GOATGCAGGG 
TCCCCX3U3GA 
CCCQGGGGGC 
GOCAGGCGCA 
CCATGGGAOC 
GCRCAQCATC 
GGTTTCTCTT 
QAlQAGTTTAC 
AAGCACTGCT 
ACTTBQOaCA 
TAGACATAGT 
GGTACAOATG 
CCTTCAGTCC 

ACOCT6T6G6 
ATATTTATTT 
TAAASMVTCT 



GRGGCCTGGG 
CAAGACGAGS 
ACTGCCTTCT 
TOQGCTTCQA 
CCGGCIGACr 
CCTTOGGaCXZ 
CATCCTCAGT 
ACTGAATCAC 
CCXZTTTITAT 
CAAGA<3AGAT 
AGTTTACTTG 
CTTCCACCC7 
GTGTGGGGTT 
TTCCTGCCTT 
CGCCCACGT6 
CCACTBAAAA 
6AT6TTTCAT 
TTTATACCAA 
ACTCTTOG 



GTTGGGGTG6 
ATGTGGClYaT 
CCTTCCTGCT 
GAAAGACAAA 
CGCCCATCAC 
CTGGCTCTGA 

AGAGCCTTTG 
CTGCTGGTGT 

TCrClCCTOC 
GACOCAGCCA 
GGaySCTCTGG 
AGAGTCATCT 
CTAGCTGCGT 
GCACATGOCC 
ACXGAC&GAT 
ATQAATCACT 



AQGGTGC6GC 
TGAOATtsTGG 
TCATCCGGCT 
CTTGTCTGGA 
CTCATCTCCC 
GTCGCTCTCC 
GATCC30C3TCC 
COTGAAACAG 
GGACCACACX: 
CCTTATTTAT 
CCATGGTCGC 

CACOCGGOGA 
CTAOTTCCCC 
GGGOCCACGG 
TTOaGTQACA 
TATTTTTATT 
TTl " rJ ' i " l ' iTA 



CCGGAGTTGT 
GGCACACTCA 
TAGCTTGQGG 
AACCAGAGTT 
TGTGGACTTG 
CACOCAGCCT 
GGCAGCTTGG 
CTCTGCCAGG 
TGGGCCTGGC 
TATTTAAACA 
CATCSTCCrC 
GCCTTGOOGG 
GGXAGCAITT 
ACCTCAATC3C 
TGOGGCCTXA 
AATTCCTCTT 
TATTCAATGT 
AGAAAAAAAA 



Seq ID NO: C173 DNA Sequence 
Nucleic Acid Accession #: XM_a975aB 
Coding sequence: 44..2768 



2820 
28d0 
2540 
3000 
3060 
3120 
31.80 

3300 
3360 
3420 
3480 
3S40 
3600 
3660 
3720 
37B0 
3840 
386& 



1 11 21 31 41 51 

| | | | | | 

TOAAACGCGO TTGTaOTaCA AAGGAAAACC CACAGGCCAA GGAATGGGAA 6ACCAA0GTT 60 

GACACTTGTT TGTCAOGTGT C3iATAATCAT CTCTGCXXXSG GACCTCAGCA TGAACAAOCT 120 

CftCAGAGCTT CASOCTGOCC TCTTCCSICCA CCTGOGCTTC TTGOAQOAGC TXaOGTCICTC 180 

TGOGAACCAT CTCTCACACA TOCCA6GACA A8CATTCZCT G6TCICIACA GCCXGAAAAT 240 

CCIGAT6CIG CAGAACAATC AGCIGGOaOQ AATCC3CCBCA GAGaCQCTOT GGOASCTBCC 30D 

GiAGCCTGCAQ TOQCTGCGCC TAGATGCCAA OCTCATCTCC CTGGTCCCGG AGAGGAGCTT 360 

TGAGGtSGCTG TOCTCJCCTCC GCCACCTCTG GCTGGACQAC AATGCACTC3V CSSGAGATCCC 420 

TOTCAGGGGC CTCAACAACC TCCCTGCCCT GCAGGCCATG ACOCTGGOCC IXTiACCGCAT 4B0 

CM30CACATC CCCGACXACG GGTTCCAGAA TCTCACQ^GC CTTQTGGTGC TGCATTTGCA 540 

TAACAACC93C ATOCAGCATC TGGGGACGCA CaGCTTOSAG GGQCIGCACA ATCTGGAGAC 600 

ACIAGAGCTG AATTATAACA AGCIQCAOQA OTTOCCTGTG GCCATCOGGA CCCTGGGCAG 660 

ACTGCAGQAA CrGGGGTrOC ATAACAACAA CATCAAGGCC ATCOCAGAAA AGGCCTTGAT 720 

GQGGAACCCT CTGCTACMA DGfATACACTT TT31TGATAAC CCAATCCRGT TTGTGGGAAG 780 

ATGQGCATTC CAGTACCT6C CTAAACTCCA. CACACIATCT CTGAATGGTG CCATGGACAT 84.0 

CCAGGJUSXCT CCAXSUiTCTGA MtOBCMZCAlC CAGOCIOGAS ATCCTQACOC TSACCCOCGC 900 

AOaCATCaaG CTGCTCCCAT CGC3GGATGTG CCAAGAGCTG CCCAGGCTCC GAGTOCIGGA 960 

ACIXSTCTCAC AATCAAATTG AOGAGCTGOC C^CCTGCAC AGGTGTCAGA AATTOGAGGA 102 0 

AATOQGCCrC CAACACAAOC GCATCTGQGA AATTGGftSCT GACACXITTCA GCCAGCTGAG 1080 

CTCCCTGCAA OCCCTGGATC TTAGCrGGAA GGCCATO0G6 ICCATCXACC CGGAGOCCTT 1140 

CTCOICCCTG CACTCXXrrGG tTCAAGCllGGA CClQAiCAfiAC AACCAGCTGA CCIkCACSQaC 120D 

CCTGGCTGGA CTTGaGGaCT TQATGCATCT 6AAGCTCAAA OGGAAOCTTG CTCICTOCCA 1260 

GGCCTTCTCX: AAGGACRiGTT TCCC!ftAAACT GAGGATCCTG GRGGTGOCTT ATGOC!TACCA 1320 

GTGCIGTOOC TAIGGGATGT GTGCCSUBCTT CTTCAAGGOC TCTGGGCAGT GGGAGGCTGA 1380 

AQACCTTCAC CTTGATGATG AGGAGTCITC AAAAAGGCOC CIGGGCC70C nGOCAGACA 1440 

AGCAGAGAAC GACTATQACC AlOGAOCiaGA TOaGCTCCAS CtGOaGATGG AQGACTCAAA XSOO 

GOCACACCCC AGTGTCCAGT GTAGGCCTAC TOC2U3GCCCC TTCAAGCCCT GTGAGTACCT 1560 

CTTTGAAAGC TGGGGCATCC eOCTGQCCGT GTGOQCCATC OTGTTGCTCT CCOTQCTCTQ 1620 

CaVATGGACTG GTGCTGCTGA CCGTGTTCGC TGGCGGQOCT GTCOOCX3GC CCX50GGTCAA 1680 

GmOTGOTA GGTGOGATTS CAGGCSGGAA CACXmX3ACT GGCASTTOCT GTGGCCTTC:;r 1740 

AGGCTGAGTC GiVSGCCXSTOA CCTTTGQTCA OTTCTCIGAG TAGB6AGG0C GCTGGQAGRC ISOO 

GGGGCTAOGC TGCGGGGOCA CTGGCITGCT GSCAGTACET GGGTOGGAGQ CATOaOTQCT 1B60 

GCIGCICACT CIOGOOGCAS TGCAGTGCAG CGTCTOCGTC TCCTGTOTCX: GQGOCTATGG 1920 

QAAOTCCCCC TOCCTGGGCA GOGTTOG7U3C AGG6Ga:OCTA GGCTGCCTGG CACT^CAGG 1980 

GCTGGCCGOC: GCGCTGCCGC TGGCCZClnGT GGGAGAATAC G9GQCCICCC CACTCTQCXT 2040 

GGCCTAOaCS CCACCTOAGG GTCAGCC»GC AGOCXTDGGGC TFCA00GTG6 COCIGGTGAT 2100 

OATGAACTOC TTCTGTTTCC TGGTGGT6GC GGGTGOCIT^ ATCSUVACTGT ACITOTGACCT 21G0 

GCOGOGGGGC GACCTTOAGQ CCaTGTGGGA CTGGGOCAT6 GTGAGGCA06 TGGCCIGGCT 2220 

CATCTTOGCA 6A0GGGCTCC rCTACTGTOC OGIQGCXITTC CXCAGCrrTG CCTCCATCCT 2280 

GGGGCICXTC CCTGTCAiCGC GCQAGGOOGT GAAGTCTGTC CTGCrGGTGG T6CTG0CCCT 2340 

. GCCTQCCTGC CTCAACGCAC TGCTGTACCr GCZCITCAAC OCGCACCXCC GQ(SMSGM3CX 2400 

TOGGCGGCTT CGGCQ3C360S CAGGGGACTC AQGGCCCCTA GOCTATGCTG 0GGC0GQ6GA 2460 

GCTGQAflAAO AGCTCCTGTG ATTCTACCXX QQCCCTGGTA GCCTTCTCTG A^CGaJGGArrCT 2S20 

CATTCTGGAA GCTTCTGRAG CIGGGCGQCC CCCXGGGCTG GAGACCTATG GCTTCCCCTC 2580 

AQTGACGCTC ATCICCTGTC AGCftGOC!AGG GGCCCOCAGQ CIGGAG3QCA GGCATTGTGT 2640 

AGAeCCAGAO aOGAACCACT TTGGGAACCC OCAAGOCTCC ATGGATG6A6 AACTGCIGCT 2700 

GAGGGCAGAG GGATCTAOGC GAGCA£3Gieo AGGCn^TOTCA GGGGGTGGC3G GCTTTCAGCC 2760 

CTCIGGCTTG GCCTTTGCTT CACACHTGTA AATATCOCTC OCCATTCTTC TCTTCOCCTC 2820 

TCTTOCCTTr CCTCTCTCOC CCTC3GGTGAA TGATGGCTGC TTCTAAAACA AAXACAAC5CA 2880 

AAACTCAGCA GIQTGATCTA TAGCAOOATG GCXX:AGT00C TGGCTGCACT GATCACCTCT 2940 

CTPOCXGTGAC CATCftCCaAC GGGTGOCTCT rUGGCCTSQCT XlOOCTTGGC CTTCCTCAGC 30OO 

TTCACCTTGA. TACTGGGGCl* CiTeCTTGTC ATGTCTGAAS CTGTGGAGCA GAGACXZTGGA 3060 

CTTTTCTCTG CTTAAGGGAA ATGAGGGAAG TAZ^RflACnffr GAAGQGGTGG AGQGTTGATC 3120 

AGGC5CACAGT GGACAGGGAS ACJCTTCACftGA GAAAGCXKTG GAAGGTOATT TCCCQTOTGA 3180 

dXATOGATA GGATACAAAA TGTOTTOCAT GTAGC3VITAA TCTTGACATA TGCCATGCAT 3240 

AAAGACTTCC TATTAAAATA AGCTTTGGAA GAG 3273 



Seq ID NO: C174 DNA Sequence 
Nucleic Acid Accession #: NM_130049 






coding sequence: xoi..2a44 



1 
1 

AGTCIGGOCC 
TGGCCACAGC 

TCTGCTGAGC 
GTTAAATAOG 

OCTSGftGGCC 
CGAGOaC&CC 
GGGCCTGCrC 

6GAGGAGGGG 

CTTTCTGTTC 

AGCCAGCAGC 
CAQOaTATQC 
<SGt£GMXXXSQ 



i 

CTGCTTCTGG 
CTtJCTCACCT 
CrGGCGQA.CC 
GCCCTGGGCC 
A8STA05TCG 
TOTGAGGACA 

GCTGCCGGCC 
6TGG6GGOGG 



21. 
1 

CA.GCAAAGCC 
GAGCftCtSCia 

CTGGCCAS3GO 
QTGTGCACTG 
TGGGCXSAacC 
GCX3GCXrrCA6 

jveecJcciGAc 

OOGCrCGGGG 



OGGGACCCTG TGOCCCTCAT 
CTQAiSTGOCA GGGAlGGTGAT 
GfUBGGCTGGG CGCAACTGAG 



TCTGTAOQGC TCOCTGGCCA CGCTOCTCAT 
GCTGACCTGC ACTQGCTOCft. GGGOSOTCOC 
OGCA6TGGGT GCACXCACTG GGGAOGCTGT 
GChTACACAC AOGGAAGM3G GCCtCSUSOGC 



CCCGGA6GAC 

ccacggtgtg 
gggctccogc 

caacttcgcc 

GGC!Cftt3CTO& 
CTTGCTOCAC 
CAiOGGCCTTC 
CTGGATCCTG 
GGOSAIGTIG 

CTGATAOCCT 
GCCCAGAAAC 



CTGGAjGGACG 
TtTCTGCAQC 
GCAGACCTGG 
CX3\6AiGTTGA 
GAOGGGCTGG 
CrrOGCCQTOT 
GCGOG6CT6T 

GCAlSCGGCCA 
AAAGmOOGS 
GGCTGGAOOG 
QGCCTAGTCC 
CAGAAGCXX:C 



GGCCCTGOGG 
TOQCACCCfta 
IGGOGGAGGA 
SGCTACIOCC 
CC6TGGG0GC 
TCTGCCAOOA. 
COGTGCGOCA 
ACOTGGCACT 
COQGCCTGTT 
ACOCGCOGCC 

cccacctttg 

TATAGAGGCC 



31 
I 

GOCCTCAGOC 
GGAGCIGAGT 
GGTGAC5GGDG 
OGCTCTOGAT 
CACCAACGQG 
TQAGGGOTCA. 
TGCQ8CC3SCC 
GCTCTGGGCC 
OCOSGGCCTG 

CAGXGCTG6C 

CCCTATGAjCG 

ccacastgajC 

CAGCTCCAGC 
GGCTGCATAT 
CC3CIGCCCTG 
GCAGGACCaO 
CTGCCTCTGC 
CCACTACATC 
CCXGCATCTG 
ACAGCCCftCC 
QAftCCTCTTC 
OCACAGCAGC 
CGAGCTCCOa 
GAGCCOOGAG 
CTAXKTGKTC 



GTTOCCA"C3VC 
AGCACIGCTG 
OGGQOTlQGft 
CCTCTACSXA 



41 
t 

AGCCCAGAAG 
ATGGOGXOOC 
AOGGCGTGCC: 
CAASAGGCTC 
CCGTGTGGAA 
GGGCTGOCCC 
6TCCTGTACC 
TCTCATGCBG 
AGCTGGCT6C 
GTAQATATCX: 
GGOGTCCTGG 
AGCCZCICAGT 
CTOGOOGA6C 
CRCRGTCATC 
AACAGCTCCA 
GOACTOTOQQ 
GTCCAACAGC 
CTCAGCCAGT 
GCGGTCTTTG 
CTGCAGACCT 
AGGCCCAAjC?G 

AATCTCCTGC 
CATAGCCACG 
CAGCX!CAAGC 
CTGCrGAAGC 
ACrCtGGGGS 
TGCTOCTGGA 
GAGCTGGGC3G 
CTGAACCTGG 
GTCAGOGAGG 
GCACXCTGGG 



51 



TAOGAGGATG 
ACTTAflfiATC CCACAC3CTCA 
CCASTCCCAA CTGCAGXAAA 
AA 



CACTGGGCCT 
TGGTCTCGCT 
OGCCTGCTGG 
TC3GGCGGCCT 
AGTGCCTGTC 
CG6GCCGGGT 
TCAGCMUXX: 
ACCAOCTCCr 
TGCAGAOGAT 
CTCRGCXQCT 
CTGCCCTGCT 
ACTTC3GTGGA. 
TGTCAGOCTT 
GGCACAGGGG 
GTGTGTGGGA 
AACAOGCTGO 
AGCTGAGTGG 
CAGAGAGGTA 
GCCTCCTGCT 
TCCTGAGCCT 
TGCrOGGGCT 
TGGCTATSCT 

TQcacsasQA 

GGOGCCACAG 
C3CCCXXACGA 
CTGAGOOCA0 
ACGCCXsTGCA. 
AGACOGGGCT 
ACTTCGCCGC 
CCTOCGOGCT 
AGAGCGAGGC 
ACATGCTCCC 
AC3\ACGIGG6 
ACATCAOCTT 
GAAACCTACA 
GACACTCTTG 



Seq ID NO: C175 DNA Sequence 
Nucleic Acid Accession #: NM_018971 
Coding sequence: 1..1128 



1 
1 

ATGGCGAAOG 
AAGCTGGC^ 
CTGCTGATCXr 
TGGCTGdCCG 
OGTGOGGGGG 
CTGGCCX30GC 
TACCn:GGCCA 
GCCKTGCrGG 
GAX?GGCGGT6 
OCX3GG0QGGC 
TAOCTCC3GGC 
CC06C0GTCA 
AACTOGAOGG 
GCAG6GCGGG 

aggctgtgca 

0TCS7TGG0CA 
AOGGGCTCCG 
TTCRACAGGG 
ACCACGCAGG 



11 
I 

OGAGCQAGCC 
CGCTCAGCCT 
TGC3GGGAGCQ 
AOGGGCTGOS 
G0GGGGCGG6 
TCTTCTGCTT 
T0GO6CACCA 
IGTGOOCCaC 
GCGAGGAOGA 
TQGGCTTCCT 
TOCTCTTCTT 
G0CAOQACT6 
COfjEUCrrTOGG 
GCCGOGGGGC 
AGATBTTCTA 
GCTACCIGOG 
IGTGaCTOAC 
AQCTGAGGGA 
GSACCCA3*CC 



21 

I 

GGQGrGGCAQC 
GCTGCTGTGC 
CAGCCTGCAC 
CGOGCTOGCG 
GGG6O0GCCG 
CCft OgCOGC X: 
OGGCTTCTAT 

ciGaQOGCns 

GGAOGGGCGG 
GCTGCTGCTG 
CATOCAC6AC 
GACCTTCCAC 

GCGCOGC3CTC 
CGOO^TCACG 
GGT0CIG6TG 
CTTC6CGCAG 
CTGCTTCAGG 
CTOOQftCCTG 



31 
\ 

QGCOaCQGCG 
GTCAGCXTTAG 
GGGGCCCCGT 
TGCCTCOCJGG 
GGCGCGCrtGG 
TTCCTUCTGC 
GCAGAGGSOC 
GCGCTGGGCG 
TGGGGGCXGG 

OGCOGCAAGA 
GGCCCXK3GOG 
ACGCGGCCOG 



CTGCTCITCC 
CGGOCCGGCG 
GCOGGCATCS^ 
GGOCAGTTOC 
AAA09C3VTTG 



41 

I 

AGQCOGCCGC 
CGGGCAACGT 
ACTACCTGCrr 
CCGTCATGCX 
GCTGCAAGCT 
TGGOCGTGoa 
TGGCOOSCTG 
CGGOCTTCCC 
AGGAGOGGCC 
TGOOCSCCAC 
TGOSGOCCGC 
CCACX!OGCCA 
GGCTTfTFGGG 
AAGAATTCAA 
TQCTOCTCIG 
GC6TOOCCCA 
ACCCCGTCGT 



51 
I 

CCTGGGCCTC 
GCT<?ITCGCO 
GCrrCGACCIG 
OOOGGOGCGG 
GCrCGCCTTC 
<53TCACCCt3C 
GCGGTGGGOC 
GCCAGTGCTG 
OGAOSGCGGC 
GCACCTGQTC 
GOGGCTGGTG 
GGCGGCCGCC 
CATCX3GaCXX: 
GAOSSAGAAG 
GOGGCOCTAC 
GGCCTAOCTG 



Seq ID NO: C176 DNA Sequence 
Nucleic Acid Accession #: NM_005631 
Coding sequences s .2653 



GGCACGAGGG 
OC5GTTOOGGO 
GCCGAGGGGC 
A£»aSTOSCC 
ASSOnSCAOGC 
C0G0CCAG05 
GGACCCGGGC 
GGG06GGAGC 
CGGCGGGGCT 
CTACGBGGCC 



11 
I 

OQCrrOAAGAC 
CCTCCGCAGC 
CGOGGCGGGC 
T^GAGCOGOCr 
GGG6GCGCOG 
GGCGG6C00G 
OGGGGGGCGG 
GCX3AGGAGGA 
GCCC0CTG06 
ACXrTCQUCAC 



31 
I 

AAjCTTGGATT 
CCAACATGGG 
G6AG0QTCCG 
COGGOGGOGC 
GGOCTTTTGC 

AGcrcccGcr 

CXTTCGAGOOO 
GOOCGQCSGGT 
ASCC6CTGG6 
TGCTG6CCGG 



31 
1 

GCGAGGCTAG 
OCCOGGGTTC 
GGaGCaCCX35 
OGAGGTOGTG 
TGAGTTaaCG 
CCTGGGGCTG 
GAACGOQACC 
GACIGGOCCT 
CXAOkACGXG 
AGACTOGGAC 



41 
I 

GGCTroaOGA 
CAAAGTTTGC 
GGCCCGGATT 



CTGCTGCTGC 
GGGOCTGGGC 
CGGOCZGCOGC 
TCCCTGGQCT 
TOCCAGGAGG 



GO 

120 

IBO 

24.0 

300 

360 

420 

480 

S40 

fiOD 

660 

720 

780 

B40 

900 

960 

1020 

1080 

1140 

1200 

12€0 

1320 

13 BO 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1.920 

1980 

2040 

2100 

2160 

2192 



60 

120 

180 

240 

300 

360 

420 

480 

S40 

600 

660 

720 

78Q 

840 

900 

960 

1O20 

lOBO 

1128 



51 
I 

GTCGTGCATC 
GAAGTTGGGC 
CTCTaOGCGC 
GSGGGCTOCG 
TOGOOaCTGC 
TCCTGCTGGG 
CTGGGAGGGC 
TOAGCCACTG 
OGGTGCTGCC 
AAG€X3Cncoa 



60 

120 

IBO 

240 

300 

360 

420 

480 

540 

600 






caajsctcgtg ctctggtcgg goctocggaa tgccccccgc tgctgggcag tgatccagcc 
cctgctgtgt gccgtrtaca tqcccragtg tqugaatg^c cgggtggaqc tgcccagcc8 
taccctctgc caggccaccc gaggcccctg tgccatc5g3x3 oajsagggagc gcjoactggcc 

TOACTTCCTG CX3CTGCACTC CTGACCXSCTT CCCT(3A7^C TGCACQAATG AGGTGCAGAA 
CATCAAGTTC AACftaTTCftO GCCftffTGCGA AGTGCCCTTG GTTC3GGACAG ACAAOCCCAA 
GAGCTGGTAC GAGGACGTGG A0GGCTGCGG CATOCAGTOC CAiaAADCOGC TCTTCACRGA 
GGCTGAGCAC CAGQACATGC ACauSCZACAT CGGQGCCTTC GGGGCOGTCA CGGGGCXCTG 
CACGCTCTTC ACCCTGGCCA CATTCGTGGC TGACTGGCGG AACTCGAATC GCTACJCCTGC 
TGTTATTCTC TTCTACJGTCA ArGCGTGCTT CTTTGTGGGC AGCAXTGGCT GQCTGGCXCA 
GTTCATGGAT GGTGCCCGCC GAG2VSATCGT CTGCCGTGCA GATGGCACCA TGAGGCTTQG 
GGAGCCCACX: TCCAATQAGA CrCTC3TCClG CGTOVTCATC TTTGTCR.T06 TGTACTAOGC 
CCTGATGGCT GGTGTGGTTT GGTTTGTGGT CCTCACCTAT GCCTGGCACA CTTCCTTCAA 
AGCCCTGGOC ACX».CCTACC AGDCTCTCTC GC3QCAA6ACC TCCTACTTCC ACCTGCTCAC 
CrGGTCACTC CXXTTTIGTCC TCaCTGTQGC AATOCTTGCT GTGGOGCAOa TGOATGGOQA 
CrCTQTGAGT GGCATTTOTT TTOTGOaCTA CAA0AACTAC GGATACCGTG CGGGCTTGGT 
OCTGGCCOCA ATOGQCCTGG TGCTCATOGT OGGAGGCTAC TTOCTC^TOC GACaGAGTCAT 
ga£:tctgttc TCCATCAAGA GCAACCAiCCC CGGQCTGCTS ASTGAGAAGG CTGCCAGCAA 
6ATCAAOGAG ACCATGCTGC GCCTOGGCAT TITTGGCTTC CrGGCCTTTG GCTTTGTGC3* 
CATTACXTTTC AfiCTGCCACT TCTACGACTT CTl"C3iflCC&G GCTGAGTGGG AGCGC3VGCTT 
OCGGGACTAT GTGCIATGXC AGGCCAATGT GACCATOGGG CTGCCCIVCCA AQCAGCCCAT 
CGCTGACTGT QAQATCAAQA ATGGGCCGnG CCTTCTC3STG GAGAACSATCA ACCTGTTIGC 
CATGTTTGGA ACTGGCATCG OCATGAGCAC CTQGGTCTGG ACXSMWaOCCA OGCTGCTCRT 
CrGG2U3GC3GT ACCTGGTGCA GGTTGACTGG GCAGAGTGAC GATGAGCCAA AGCG9GATCAA 
GAAGACCAAG ATGATTGCCA AGGCCTTCTC TAAGOTGCAC OAGCTCCTGC AGRACCCfiGG 
OCSUSGASCTG TCCIICAGCA TGCACACTGT 6IOCCAOGAC GGGCCOGTOG COGGCTTGSC 
CTTTGACCTC AATQAGCOCT <3kjGCraATGT CTCXITCTSCC TSGGCC3CAGC ATGTCftCGKA 
GATGGTGGCT GGGAGAOC^ OCATACTGGC GCAGGATATT 7CTGTCAOCC CTGTOaCAAC 
TGCAflTGC!CC CXMACSGAAC AAGCCAACCT GTOGCTOGTr GAGGC3WaW3A TCTGOCCAGA 
GCTGCAGAAG CGCCTGGGCC OGAAGAAGAA GAGGAGGAAQ AGOAftOAAGG AGGTGtGCGC 
GCTGGGGCGX3 CCGCCT6AGC TTCACCXJOCC TGCCCCTGCC CCCA6TACCA TTCCTOGACT 
GCCTCM9CTO CCCCGGCAQA AATBCCTCar GGCTGCAOOT GCCTGGGGAG CTGQOGACIC 
TTGC3aGACAG GGAGCGTGGA GCCTGGTGTC CAAOCCATTC TGCCCAGA^SC OCM3TCCX3CC 
TCAGGATCCA TTTCTGCOCA GTGCACOGGC CCCCHTOGCA TBGGCTCATQ QCCSSCXXSACA 
OGGCCTGGGG CCTATTCACI COGGCACGAA CCTGATGGAC ACAGAACTCA TGGATGCAGA 
CTOGGACTTC TGAGCCIGCA GAGCAGGACC TGGGACAaaA AAOAQAGGAA CCAArAlCCTT 
CAASGCTCrr CTTCCTCAiCC GAGCATGCTT CGCXAGGATC GOGTCITCCA GAGAAOCXOT 
QGGCTGACTO CCCTOOGAAG AGAOTTCTGQ ATGTCTGG<^ C3\AAGC3UaC!A GGACXGTGGG 
AAAGAGCCTA ACATCICCAT GGG6AGGCCT CAOOCCAGGG AOUSGGOCCT GQAGCTCAGQ 
GTCCTTGTTT CTGCCCTGCC AGCTGC3U3CX: TQOTTQaCAQ CATCTGCTCC ATOGGGOCAG 
GQGGXATGCA GAGCTIGXGG TGGGGCAGGA A08GTGG2U3G CAGAGGTGAC ASTTCCCAQA 
' GTQGCX^&OGG AGGCAGOCTA GGCTATQTCT GOCAGAXGAlG GGCCGGCXGC 
' GCTGAXGGST GCCCTTTOCT GGCAGTCTCA 6TCCAAAAGT OTT0IACT9TG 
TCATTASTCC TTTGTCTAAS TAGGGCCAOG QCAOOTTATT CCTCTCOCftG 6XGTTTGTGG 
GGCTOGAACSG ACCTGCTOCC ACA6GGGCCA TGTCCTCTCT TAATAGGTGG CACTACCCCA 
AAOCCATCTT TTGTTCTGCT ATATGCTCCT TCTOCTGTTC CATTTCAOTT CAOTZTCAGC 
GGTGCCAAOC TCnTTQCSTT TOCrTTTOXST IGtOGRGGSiC GCAGAGCTGG T6CAC3U»CT 
CACCTCI31AG CCCCTOCCCT OGCTGCTGGG COOCATCTCC: ACAGQAGAOA CiaOTTOGGC 
TCTTiGGGCCr C2\GTCaX3GncS XGGGATAGGA GCA6T6A6TG AGAAAGCCTC TGAAA6ATGC 
ATCATCTCTT OCTCACACCC ATTTAGTGGG GaATGGGTCC TCTAQACTTG AGGGGCTACC 
CTGG6AAGCT GOCGrAGCXX GAGCCM3GCA AGAAAGCTTC CTTCAAOCIG CATAGCOGGT 

GdSTGnaaAG atxccxacct Tcamaccr ccaaacatoi tcgcsu^ggcg ccactttcaa 

GAATCAGACA GCAGGAAGOC AXAGATGCT6 GCTGGGTTGC aSOTTAXOGCI GAGAAQAAAT 
ACAGTCAATA AAAGOTTTTT CTATAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
A 

Seq ID NO: C177 DNA Sequence 
Nucleic Acid Accession #: AK0M595 
Coding sequence: 1..2853 



11 
I 



CCXaaOQCTGA 
TCAGCGCCAG 
A aBAACAAOC 
TGCAA0GGG6 
AOCCTGGOaS 
CAGG7GGAGG 
GCGGGGACCA 
GATCAGGAGC 
CCGCCGGAGG 
OOCACCCAGG 
CQCCTGTOGG 
AGCACCACK5 
TCACCCTGCT 
CCCQCTCCAC 
ACCAOCATCr 
ACTGAGTBTG 
OGTGACTGGA 
TUVTAAGAAAA 
CTQTATGCGG 
GTGGTGGlt^ 
GCCCTGACTG 
CTCCTACACC 



GCCAAGCAGG 
CASASCJOGCT 
CTGTGGAGCT 
AGXGGGTCftp 
COOGCSGGQGS 
AGCTCTTTGG 
CCAAGAGTCG 
CTCXGGGCAA 

aScS^t 
acactgccaa 
ccaccgtcat 

CCAAOaaCTG 
TCAAaMAGQ 

gcccagtcga 
cccactggcg 
gcgogaogct 
ctctaagcga 
cx3ctcgtggt 
accgccgcaa 
6t6gtttoca 
octctqtgcc 



21 
I 

TGQGQGCGOG 
TAGGAAGCX3A 
GCJCCXACTTC 
CCGCTGCCSaC 
OCAGAAGGAC 
GCIGOSGGTO 
GCTGGAGGAT 
CCOAGGCTAC 
GGAGGTGGCC 
GGCCGnaGTG 
OCT6CTCAOC 
CTAIACCTGC 
OGTCXACGTG 
TGGOOGAoac 

oacxrcTCTGC 
TGGGGanoa 

TAGCOGCGAG 
GCTCGACTCT 
COOCAACAGC 
GGCCATCTTC 
CTGOCaGTOAC 
CCCCGTCAAC 

TcxnGaccra 



31 
I 

CVGCIGCrrGG 
TCGGGTOU3G 
CTGCAGGAGC 
GGCITCXXXX3 
CAGGTCACAC 



TACTGGTGOC 
GTCCQCATCG 
CTGGACCAIG 
OaATGSCrCA 
ATOGACCACA 
OTGGCCAAGA 
AAT6GCG6CX 
TCGCAGAAGC 
GAGGGQCAGG 



TGCAXOGOGC 
AAQAACTGCA 
CACCTGCTGG 

gtggtcgxgg 

TTCGACACAG 
!ETTAAGA06G 

acagccagcg 



41 
I 

cactgctgct 

TGCTCD^TQA 
CACAGGAOGC 
CCACACAGAX 
AG6AAG6CCT 
AGATOOAGeV 
AGTGOGTGGC 
CSCTACCTGOG 
AGGTTGTOCT 
AGAATOftCOA 
ACCTCATCAT 
ACATOGTGGC 
GGTCCAGCTG 
GCAGCOSQAC 
CATTOCAGAA 
GCAAGVGGIC 
OCCCACOCCA 
C3U3ATGGGCr 
AGGCCTCAGG 
CAATCCTCAT 
ACATCACTQA 
CAAGOCSCCAG 
GOGGCATCIA 



51 
I 



CTCCTTOCOG 
CTACATTGTG 
CTACTTCAAG 
GGATGAOGCC 
GXCGCGGGZU3 
CTGOAGCXCC 
CAAGAACTTC 
GCA6TGG0GC 
TGTCAICGAC 
GCGCCAGGCC 
CAAACGOCGQ 
GGCAGAGTOO 
CTGCACCAAC 
GAC06CCTGC 
AGCCTGCAGC 
GAAOGGA0GC 
GTGCATGGAA 
GGATGOGGOQ 
GGOG6TGGGG 
CTCATCTGCT 
CAAOCCGGAG 
COGCGGACCC 



660 

72 D 

780 

840 

900 

9€0 

102D 

lOBO 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1B6D 

1920 

19SD 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2seo 

2640 
2700 
2760 
aB20 
2860 
2940 
3D0O 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3781 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

loao 

1140 
1200 
1260 
1320 
1380 









GTOTATGCXX: TGCAGGACTTC CACCX3ACAAA ATCCCCATQA CCAACTCTCC 
CCCTTACCCA GCCTTAAfiGT CRAfiGTCTAC AGCTCCAGCA CCRCGGC5CTC 
CrOCCAGATG QGGCTGACCT QCTQGCSQGTC TTQCCaCCTG GCACATACCX: 
GCOOGOGACA CCCACTTCCT GCACCTOCGC AGCX3CCAG0C TCGGTTCCCA 
QGCCTGCCCC GAGACCCAGG GAGCRGCGTC ACgCGGCRCCT TTGGCTOCCT 
CTCAGCATCC CX3QGCACAGG GGTXIAGCTTG CTGGTGCCCA ATGGASCCAT 
AAGTTCTACG AGATGTATCT ACTCATCAAC AAGGCAQRAA GTACCCTCCC 
GGGACCCAGA CAGTATTGAG COCCTOGOTQ ACCTGTGGAC CCACAG6CCT 
GBGCCOGTCA TCCTCACCAT GCCCCACTGT GCG6AA6TCA GTOGCCGTGA 
CAGCTCAAGA CCCAGGCCCA CCAGGGCCAC TGQOAGSAGO ^GSTGACCCT 
ACCCXGAACA CACCCTGCTA CTGCCAGCTG GAGCCCAGGG CCTGTCACAT 
CAGCTOGGCA CCTACGTGTT CAOSGGOGAQ TCCTATTCCC GCTCAQCAGT 
CAGCTGGCCG TCTTCGGCCC CQCCCTCTGC ACJCTOCCTGG AGTACAGCCT 
TGCCTOGAGG ACACSOCCTQT AQCACTQAAG OAOGTGCTGG AQCTGGAGCG 
GGATACTTGG TGGAGGAGOC GAAACOGCTA ATGTTCAAGG ACAGTTACCA 
CTCXOC3CTCC ATGACCTCCC CCATQCCX^AT TQGAGQAQCA AGCTGCTGGC 
GAGATCCOCr TCTATCACAT TTGGAGTGGC AGCCAGAAGG CCCIXXACTG 
CTGGAlSAGGC ACAQCTTGGC CTCCAC3U?AQ CTCACCIGCA AGATCTGOGT 
GCCAiGATATT CCAGCTGCAT AOCACTCTGG CAGAQACACC 
CTOGACACTC TCTQCTCTOC CCCrGGCAGC ACTOTCACCA CCCAGCTGGG 
TTCAAGATCC CACX6TCCAT COGCCAGaAG ATATGCftACA GOCTAGATGC 
CGGOGCAATa ACTGQCaaAT GTTAQCACAa AAGCTCTCTA TGGACCGGTA 
TTTGCCACCA AAGCGAGCCC CAOGGGTGTG ATOCTGGACC TCTGGOAAQC 
GACGATGGGG ACX^-CAACAG CCOXSGOGAGT GCCTIGQAGG AGATGGGCA2L 
CrGGTOQCTG TGGCCACCOA C3GGGQACTGC TQAGCCTOCT GOGACAQCDQO 
ACTGGCAOGA GGCAGGTGCA GGGAGGCCTG GGGCAGCCTC CTGATGGOGA 
CTQCTTCCTC CCAQTTCACA QCCAOAOTTQ EICTCTCCTCC TCCTCTrTCCC 
AOCATGACCA GCCTTAGAAA ATCCATGTAC TCTGTTGrTA GAGGGCSXM 
CCACCCXXX3C TCTCTCTCTC TTGGCCTQAO ATC?TCTT3rGC AGGAACCAAG 
AGCCTCrrGGA GGCAGTTGGT TGGGGGCGGG CTVGGCAOGAG GCOCTCCX3C 
CGCTCAfiCCC GGCAACTTGT GGGrTCCATG GGTTTTAGTT GCGTTCTCGT 
CSGITATTGAT TTCTCCTTTC TCX7CTAAGCC (XCTTCTGCT TCCACGCGCT 
GAAGAGICAA GXACAATTCA GACAAAiCTGC TTTCTCClGT CCAAAAGCAA 
aAAAGTkAAiOA AAGCTTCAGA COGCTA6TAA GGCTCAAAGA AQAAGAAAAA 
ACAAGGGAAA AGAAAAACOC AGrTTCTTAQ GAAACGCAAA G6ATTTATTA 
TTGQATAAGT OCTTTTTAAA A 

Seq ID NO: C178 DNA Sequence
Nucleic Acid Accession #: NH_004S2S
Coding sequence: 310..1359



TCrGCTGGAC 
TGGGCCAGQC 
TAGOGATTTC 
GCAGCTCTTO 
GGGTGGGAGG 
rCCCCAGGaC 
GCTTTCAGAA 
CCTGCIGTGC 
CTS6ATC3TT 
OGATGAGGA6 
CCTOCrOGAC 
GAAGCGGCTC 
CS330GTCTAC 
GACrCTGGGC 
CAACCTGOSC 
CAAATACCAG 
CACITTCACC 

gosgcaagtg 
tgctgggtcx: 
acx:ttatgcc 

OOCCAACrCA 
CCTGAATTAC 
TCTGOUSCAa 
GAGTGAGATG 
GCTGGCA06G 
TGTTTGGCCT 
CAACCCCCAG 
AQTICCITCT 
ATGG08CTGA 
CACX3C3CC3C3CA 
TTTCTTCCTC 

TTTOCTCrrr 

AAAGGCAAAG 
CACCAAAACC 
TCCftGATTAT 



1 
1 

GC3VCCGGIC0 
CCTCCX33CTC 
C6CACCGACA 
TTOCCOOTOC 
ATOGGGACTA 
AT6GTCTA17C 
TGTAACAAGA 
GCCATCMCG 
GGCAATGGOC 
ATAlBTGGGGA 
GCX3VICACAG 
C7UIGGCC3VGT 
TAC3GaCAT0G 
ACTCTC&TGA 
CIGQAATGTA 
CXGCCACAGI 
GTGGAGOCTG 
CTGTCGTACC 
TGCGAOGAG6 

TAGG0006G6 
AOGXGCAQCG 
CCCGCTGCAA 
ASGASCGCTGA 
TCCXnCGOQG 
CTCCftCCCTC 
CAGGAACICX 
ATAATATXAA 



11 

L 

CTOGCGCGCA 
CGTGGCTCCC 
GACGGCCCCG 
GGGGC0G6CC 
TGAAOCGGAA 
TCC33ATCGO 
TOOCAGGOCT 
TCATAGGAQA 
GCTGGAACTG 
GC0QC3CawQGC 
CTGCCTOXAC 
A<X2bCOGGGA 
GCTTOGCCAA 
ACTTGCACAA 



TTOGGGAGCT 
TGGGTGOCAG 
GCAAI3CC3CAT 
ACCX3GGT6AC 
CXAOCOSCTO 
T6TGGCAGT8 
AGCGCACGGA 
GTCAGATTGC 
GCTTGTCTlT 
AAAAAAAATC 

occCAflArnc 



TTTATTTAAT 



21 31 

\ t 
AGGCAGCA0C GOCOGCX3CAC 

GcccoGcarc gccccaicscc 

GTGCTCCTGG OGAGGCTCAG 
6GGAX2GCCTC GGCTOGGQCC 
aSOGGGATCA GC ACftGCC OG 
AGGSGGSCXSC IGGCIGOSCX? 
TGGCTTCTOC TCAOTGGTAG 
OGCTCXXSGA CAGOGGGCGA 
AGGCTCACAA ATGOC^CCTGa 
CTCTGGACTG GGAGAGOSCA 
TV3QQTTC!ACC: XA£QCQ\TCA 
OC3W3GGCAAC CTGnSOGACT 
OGAGGGCCGG AAGTGGOGTG 
OGTCTTTGTG GATGOCOGGG 
ChACmGGCA eGCOGAAAlSA 
CSTGTCAGOC TQQTGraCCA 
GGSCrAOGTG CTCAAGGACA 
CCGCAACAAG OGGCCCACXZT 
GGACAOGGAC CTGGTGTACA 
GGGCAOTGOG GGCACCX^AGG 
TGACCXCATt? 3GCTQ-TG@6C 
CftACXGIAAG TTOCACTOGT 
GATGTACAOa TOCAAGTGAG 
TGGGAGGACT GGRCC6TTTC 
TCTGCroaiSG AGGGTACTTT 
TCTCAG2WGGC CICAACIATT 
AGOCCAGGTC CCTCOBOCPGC 
GCCICATCAG AGCAATATH 
TAAAAAGAAT TClTCC3hCAA 



TCCCCBCSTC 
GCGCTCGCTC 



51 

5 



CTCCCTCCCr 

C3GCGGACGGG 
GCTATGTTGA 

caacacxaoGCA 

CAGGCTGGGC 
AftQC!ATC!ATC 
COGGOCOGAC 
GITTCAGTTC 
GAAGGAGCTC 



CAAAOAOAAG 
OGACATOOGC 
QAATOCCCGG 
GAACAXGAAQ 
Cll3QA!aC3UCA 
G6O0CTTCAC 
CAAGAAGOCSl 
GCCCAACTAC 
CAACAAGACG 
CSICCCACCAG 
CAAGTGCAAC 
ACACCACCCT 
GCTOCCTGGC 
CCTGCAi3GCA 
CGCAKXCCTG 
CCTTCTQCAO 
TdGATftAAA 



Seq ID NO: C179 DNA Sequence
Nucleic Acid Accession #: NM_003786
Coding sequence: 71..4654



144 0 

1500 
1560 
1620 
1680 
1740 
1800 

leeo 

1920 
1980 
2040 
2100 
2160 
2220 

22ea 

2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2&30 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3501 



TOCCGGGOOG 
GOOGQOGOCC 
AOCrCITTCT 
CTUiVQGOGC 
TCTGOCAGAG 
ACGAGTGTCA 
COGTCTTOGO 
mGOCGQQGG 

GCTGdCTGC 
AGAyaAOCA 
Q.CCX'GGjUSGA 
OCAAQAOQTQ 
AGTACAAOGA 
TCCTGAAGAT 
TOGAGAAGTC 
GCOGGGGCTG 
QTGGCIACftA 
GCTGCTAT6T 
CCOCQTBTGC 
CAAGCTGCGG 
TCCrQGGTTT 
CnSTTCCACA 
TGGAOOGAAG 
AACaUVSXXAT 
AAAAAAAAAA 



1 11 21 31 41 51 

1 I I I 1 I 

CTOCGGOGOC OGCTCTGCCC GCCGCTGOGX COGACOeCGC TOGCCTTOCl> TGCAGCOGCG 
CCTCGOCCCC ATGGACGCCC TGTG060TTC CGGGGAGCTC GGCTQCAAQT TCTGOGAXTTC 
CAAOCTGTCT GTQCACACAG AAAACCXS3GA CCTCACrCCC TGCTTCCAGA ACTOCCTGCT 
GGCCT6GGTG CCCTGCATCT ACCTGTGGQT GGGCCiaCCC TGCIACTTGC TCTACXTEGOS 
GCAOCATTOT GGTGGCXftCA TCAXCCTCIC OGACCTGTCC AAGCTCAAGA TGGTOCTGQO 



12D 

IBO 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

ISOO 

1560 

1620 

1680 

1736 



60 

120 

180 

240 

300 



TGTCCTGCTG TGOTQCQTCT CCTOGGCGGA OCTTITTTAC TCCTTCCILTG GCCTGGTCCA 360

TQGCXX3G6CX: CCTGCCCCTG TTTTCTTTQT CACCCCCrrG GTGGTG66GQ TCACCATQCT 420 

GCTGGCCACC CTGCTOATAC AI&TATG2U3C6 GCTtSCAGCSGC GTACROTCrT CGGGGGTCCT 480

CATTATCITC TGeTTCCTGT GTGTGGTCTQ OGOCATOGTC CCATTCCGCT CCJiAOATCCT 540 

TTTAQCCAAG GCAGAGGCSTG AGA-TCTCAGA CCCCTTCCGC TTCACCACCT TCTACATCCA 600

CTTTSOCCTG GTACTCTCTG CCCTCATCTT GSCCSGCTTC A6GGAGAAAC CTOCATTTTT 660

CTCCSCAAAG AATGTOGACC CTAACCCCTA CCCTGAGACC AGCQCTGCSCT TTCTCTCCOG 720 

CCTGTTTTTC TGQTGGTTCA CAAaOATGOC CSLTCTATOGC TACCGGCATC CCCTGGAGQA 780 

GAAGGACCTC TGGTCCCTAA AGGAAGAGGA CAGATCXXSyS ATGGTGGTQC AflCAGCTGCT 840 

GGAGGCATGQ AGQAAGCAQG AAAAOCJW^AC GGCACGACAC AAGGCXTCA6 CAGGACCTGG 900

6AAAAATGOC TCCG6CX3AG6 ACGAGGTGCT GCTGOSTGOC GGGCCX3VQ5C OCXXSOAAGCC 960 

GTCCTTCCTS AAGGCCCTGC TG^GCCACCTT CGGCTCCAOC! TTCCTCATCA GTGCCTGCTT 1020 

CRRGCTTATC CftGGACCTGC TCrCCTTCArC CAATCCACA6 CTGCTCAGCA TOCTI3ATCA6 1080

GTTTATCTCC AACCXX»TGQ CCCCCTCKTa OTGOGQCTTC CTGGTGGCTG QGCTGATGTT 1140 

CCTGTGCTCC ATGATGCAGT OGCTGATCTT ACAACACTAT TAOCACTACA TCTTTGTQAC 1200

TGGGCSTQAAQ TTTCaXhCTG aOATCATOaa TQTCarCTAC AOGAAOGCTC TGGTTATCAC 1260 

CAACTCAGTC AAACGTGOGT CXIACTGTGGG GGAAATT6TC AACCTCATOT CAaTGGATQC 1320 

CCRGCGCTTC ATGGACCTTQ CCCCCTTCCT CAATCTGCT6 TGG^CASCAJC CCCTGCAGAT 1380 

CATCCTQGCG ATCTACTTCC TCTGGCAGAA CCTAGGTCCC TCTGTCCTOS CTGOAQTCDGC 1440 

TTTCATGOTC TTGCTQATTC CACTCAACGG AGCTQTGGGC GTOAAGATGC GCaGCCXXOCA 1500

GGTAAAGGAA ATGAAATTGA A13GACTC606 CATCAAGCTG ATGAGTGABft TCCTQAADGG 1560 

CATCAAGGVO CrQAAOCTOT ACGCCTGQOA GCCO^GCTTC CTGA2USCAG6 T06A6GGCAT 1620 

CAGGCRGCGT GAGCTCCAGC TGCTGOGCAC GGOGGCXTTAC CTCX3VC3W3CA CSUMICACCTT 1680 

CACXrrGGATG TGCAGGCX:CT TCCXG«5TGaC CCTGAlTCaCC CTCTGGGTGT AOGTGTAOGT 1740 

GOACCCAAAC AATBTGCTGO ACGCCGAGAA GGCCTTTQTG TCTaTOTCCT XGXTTAATAT 1800

CTTAAGftCTT CCGCTCAACA SGCTGOCCCA GTTAATCAGC AACCTGftCTC AGOCCAlGTaT 1860

OTCTCTBAAA OGQATCCAGC AATTCCTOAG CCAAGASSAA CTTGACCCCC AGA6TGTGGA 1920

AAGAAAGACC ATCTCCCCAG GCTATGOCRT CACCATACAC AGTGGCAGCT TCACCTGOQC 1980 

CCaMSGACCTG CCCCCCACTC TGCRCSfWSCCT AflACRlCCAG GTCCCGAAAG GGGCACXGGT 2040 

GQCOGTGGTG GGGCCTOTGG GCTGTGGGAA GTCCTOCCTG OTGTCTOCICC TGCTGGGAGA 2100

GKIG6ASAA6 CTAGAAGGCA AAGTGCACAT 6AAGGSCTCC GTOGCCTA.TG TOCCCCAGCA 2160 

GGCATGGATC CAGAACTGCA CTCTTCAGGA AAAeJO T OCTT TTCaGC3UW6 OCCVGAACCC 2220 

GAAGCGCTAC CAGCAGACTC ItSGAGGCCTG TGCCtlGCTA GCTGACCTGG AGATGCTGCC 2280 

TGGTGGGGAT CAGACAGAGA TTGGAGAGAA GGGCATTAAC CTOTCrGGQG GCC2AGCX3GCA 2340 

GOGGGTCA6T CrGGCrOSAS CXGTTTACAG TGATGOCGAT ATTTTCTTGC TCGATGACCC 2400

ACTGrCOGCG GTGGACTCTC ATGTGGCCAA GCA^T^TT OACCACJOTCA TOSGGCCAGA 2460 

AGGGGTGCTG GCAGGCAAOIl GGGGAGTGCT GGTGftOGChC GGCATTAGCT TCCTGCCX3CA 2520 

QACAOACTTC ATCATTGTGC TAGCTGATGG ACAGGTGTCr GAQATGOaCC CGTACCCfiGG 2580

CCTGCTGCAG t^GCAACGGCT CCTTTGCC3WI CimCTCTGC AACTATGCCC COSATGAGQA 2640

CXAAGOQCAC CTTGQAGGACA GCTGQAOCGC QTTGGAAGGT GCAGAGGATA AGOTUSGCACT 2700

GCIGATTGJ^ GACACACICA GCAAOCftOkC GGAXCCG&CA GACAATGIiTC CA0ICRCC7A 2760 

TGTGBTCCAG AAGCAGTTTA TGAGACAGCX GAGTGCOCTQ TCCiXUUSATG GGGAGGGAiCA 2820 

OGGTCGGCCT GTAOCCOGGA GGCACCXGG6 TCCTiTCAGAG AAOGTGCAG6 TGACAGAOGC 2880 

GAAGGCAGAT GGGGCACTGA CCCAGGAGGA GAAAGCAGCX: ATl^GGCACTG TGGAGCTC3SG 2940 

TGTGTTCTGG GAITATQCCA AGGCOSTGOG QCTCTGTA£!C ACGCTGGCCA TCTGTCTOCT 3000

QTASIiTGGQT CAAAGTGOSG CIGGCATTGG AOCCRATGTG "XGGCTCAOTQ CCCGQACAAA 3060 

TGATGCCKTG GCAGACAiSTA GACAESAACSUl CACTTOCCTG AOGCXGOGOG TCTATGCTGC 3120

TTTAGGAATT CTOCAAGGGT TCTTGGTGAT GCXGGCAGCC ATGGCCATQO CRiaCGG&TGG 3180 

CATGCAGGCT GCCC6T6TGT TGCACCAGGC ACTGCTGCAC AACAAGATRC GCTCGCCACA 3240 

GICCTTCTTT QACAiXaU^AC C&TCAGGCOG CATOCTQAAC TGCTTCTCCA AGGACATCTA 3300

TQKXSXTGAT 6AGGTTGTGG OOOCTGTCKT GCTCKEGCTG dXAATTCCT TCTOMOQC 3360 

ciacTCChcr cttgiggtca TCAxaaccaiG csuaKioacrrc ttcactgtgg tcatoctgoc 3420 

CCTGGCTGTG CTCTACACCT TAGTGCAGOS CTTCTATGCA GCCACATCRC GGCAACK3AA 3480 

GOSGCTGGAA TCAGTCAGCC aCTCACCJTAT eTACTCCC3i.C TTTTCGGAGA CAGTGACTGG 3540

!rGCt!AGTaiC ATCQGGGOCr ACAAC06CAG COGGGATTXr GAGATC3UrCA GTGA3ACIAA 3600

GSTOGATGCC AACCAGAGAA GCXGCTACCG CXMSaXAVC TCC&AOCGST GGCIGJUSCKT 3660 

OOGASIGGAO TT03TGGGG91 ACTGCGTGGT GCTCTTTGCT GCA.C7ATTTG CC3IXCKK33& 3720 

GAGSAGCAGC CTGAAiOCCGG GGCTGGTGGG CCTXTCTGTG O^aCTACTOCT TGCAGGTGAC 3780 

ATTTGCTCTQ AACTGGATQA TACGAATGAT GTCAGATTTQ QAAXCTAACA TCGTCGCTGl? 3840 

GGAQAGOGTC AA6GAG27VCT CCAAGACAGA GACAGAGGGG CCCIOGSTGG TOGAAGQCAG 3900

CCGCCX^XCrC GAAGGTTGQC CXrCCACGTOQ OGAGSTGQAQ TTCOGGAATT ATTCTGIGGG 3960 

CTACOTKX33 GGCCTAGRCC TGGTGCTGAG AGACSCTGAGT CraCATOTGC AOSGTaGCSaA 4020 

GAAGGTGGG6 ATGGIGGGCC GCACXOGGGC IGGC&RSTCT TOCATGACCC TTTGCCTGTT 4080 

CMGCATCCTQ GAQGCGGCAA AGQQTQAAAT OCGCATTOAT OGCXMAATG TGGCftfiACAT 4140 

05 CGGCCTCCAT GACCTGOGCT CTCAGCTSAC CATCATCCCG CftG<3ACCCCR TCXTTOTTCTC 4200 
OGGGACCCTG CGCATGAACC TGOAGCCCTT OGGCAGCTAC TCAGAOSAGG ACArTTTGGTG 4260 
GGCTTT6GAG CXGTOCCftCC TGCACACGTT TSTG2U3CTCC CAGOCGGCAG GCCTGGACTT 4320 
CCAGltSCTCA GAGGGCGGGG AOAATCTCS^G OGTOQGCCAG AGGCAGCTGG XGTGCCTGGC 4380 
COGAGCCCTG CTCOGCAAGA GCGGCATCCT GGTTTIAGAC GAGGCCACAG CTGCCATCQA 4440
CCIGGAjSACT GACAACCTCA TCCAGGCTAC CATCOGCACC CAGTTTOATA CCTQCACTGT 4500
CCEGAOCATC GCACAC3CGGC TTAACACTAT C^AlrGGACTAC AGCASGGTGC TGCTOCTGGA 4560 
CAAAGGA6TA QTAGCTGAAT TTGATTCTOC AGCCAAGCTC ATTGCSUBCTA GAOGGATCTT 4620 
CraCGGQATG GCCAlSASATG CTGGACTTGC CTARAATATA TTCCTGaSAT TTCCTCXTrGS 4680 
CCTTTCCTGG TTTTCATCAG GAAGGAAATG ACACCAAATA TGTCOaCAGA ATGGUlCrtrGA 4740 
TAGCAAACAC TGGGOGCACC TTAAGATTTT GC3\0CTGIAA ASTGCCTTAC AGGGTAACTG 4800
TGCTGAATGC TTTAQATGAG GAAATGATCX: CCSUUTTGGTO AATQACACGC CTAAGGXCAC 4860 
AGCTAGTTTG AGOCAGTTAG ACTAGTCCCC OGGrCICCC3G ATTCXXIAACT GAC?rGTTATT 4920 
TGCACACTGC ACTGTTTTCA AATAAGOATT TTATGAAATG ACCrCTGTCC TCCCTCTGAT 4980 
TTTTCATATT TTCCTAAAGT TTCGTTTCTG TTTTTTAATA AAAAGCTTTT TCCTCCTGQA 5040 
ACAGAMSACA GCIGCTGOGT CAGGCCACCC CTAGQAACTC AGTCCTGTAC TCTGGGGTGC 5100
TGCCTGAATC CATTAAAAAT GGGAGTACXG ATGAAATAAA ACXACATGGT CAACAGTAAA 5160 
AAAAAAAAAA AAAAAA 5176 

Seq ID NO: C180 DNA Sequence




PCT/US02/36810 



Nucleic Acid Accession #: NM_004626
Coding sequence: 124..1188

1 11 21 31 41 51

| | | | | |

rAACCGSOCG CCTCC33CTCT CCCCGGCTGC AGGCGGCGT6 CAGGACCASC GGGOGCCGTO 60 

CAOGCOCSAGQ ACl-rCGGCQC GGCTCCTOCT GOGrrGIGAOC CX3GGGCG0GC COOCCGCGCG 120 

JVCGATGAGGG OGCOSCCGCA GOTCTGCGWS GCGCTGCTCT TGGCCCTGGC QCTCC3U3ACC 180 

GGOjraTGCT ATOaCATCAA GTGGCTGGCG CTOTCCAAQA CACCATCOGC CCTGGCACTG 240 

AACCAGAOSC AACACTGCAA GCaGClXSGAG GGTCXGGT6T CTGCACAGGT GCAGCTGTGC 300

CGCAGCAACC TGGAGCTCAT GCAChOSGTO OTQCAOaCCa CCCGCBAGGT CA.TGAAGGCC 360 

TGTCGCCGGG CCTTTGCGGA CATGCGCTGG AACTGCTCCT CCftlTGAGCT GGCCOCCAAC 420 

TATTTGCTTTG ACCTGOAOAS AGGGACCOOG GAGTCXKKXrr TDGTGTATGC GCTGTOGGCC 480 

GCCX5CCATCA GCCACGCCAT CXSCCCGGGCC TGGACCTCCG GCXSACCTGCC CGGCTGCTCC 540

TGCGGCCCCO TCCCAGQTGA OCCACXSOGGQ CX7CGQQAACC GCTGGGC3AGG ATGTGCGGAC 600

AACCTCAGCT AGGGGCTGCT CATGGGGGCC AAGTTTTOGG ATCCTCCTAT OAAGGTGAAA 660 

AAAACAGGAT OCCAAGCCAA TAAACTQATG CGTCTACACA ACAGTGAAGI: GGGGAdGACAG 720 

aCTCTGOGC3G CCTCTCTGGA AATSRAGTGT AAGTGCCATG GGC5TGTCTGG CTCCTGCTCC 780 

ATCCGCACCT GCTGGWVOOa QCrOCftGGAG CTOCavSGAlG TGGCKSCrGA CCTCAAGACJC 840

CGATACCT6T OGGCCACCAA OGTAGTGCAC GGACCCATGG GCAGCOGCAA GCACCIGGTG 900

CCXZAAGGACC TGQATATCCB aCCTBTOAaO OACTOGOAAC TOGTCTATCT GC2&GAGCTCSV 960 

CCTGACTTCr GCATGAA6AA TGAGAAGGTG GGCTCCCAC3G QGACACAAGA CAGGCAGTGC 1020 

AACAAGACAT CCAACXSGAAQ CGACWSCXGC GACCTTATOr GCIGCGOGCX3 TGGCTACAAC 1080

CCCTACACAG ACCX3CX5TGGT CGAGDGGTGC CACTOTAAiCjr ACCACTGC3TG CTGCTAOGTC 1140 

AGCTGOCGCA GGTGTGAGCG TAiCCGIGGAG OGCXATGTCT GCAAGTGAGG CCCTGCGCTC 1200

CaC3CX!CAOac AGGAaGOAGa ACrCTQCTCA AGOhCOCTCA GCAACTGGGG CCAGGGGCCT 1260 

6GAGACACTC CATGGAGCTC TGCTTGTGMl TTCCAGATGC CAGGCATGGG AGgOGGCTTC 1320

TGCTTTGOCr TCACTTGQAA OCCACCAGOA ACAGAAGGTC TGGCCAOCCT GGAAGGAGGG 1380

CAt3GACATCA AAGGAAAOOG ACAAGATTAA AAATRACTTG GCAOCCTGAS GCTCTGGAQT 1440 

GGCCACAGGC TGGIGTAAGG AGCGGGGCIT GGGATOGGTG AGACTGATAC AGACTTGACC 1500

TTTCAOaaCC ACJUSAGACCA 0CX:TC3CGGGA AGGGGTCTGC GCGCCITCIT CAGAATGTTC 1560 

TCCOGGACCC CCTGGCCGAC OCrGGGGTCT GAGCCTGCTG GGCCCACCAC ATGGAATCftC 1620 

TAfiCTTGGGT TGTAAATGTT TTCTTTTGTT TTTTGCTTTT TCTTCCTTTQ GOATOTGGAA 1680 

GCTACAGAAA TATTTAXAAA ACATAGCXTT TTCtrTGGGG TCGCACTTCT CAATTCCTCT 1740

TTATATATIT TATATATATA AATAT^ATATG TATATATATA ATQATCTCTA TTTTAAAAd 1800

AGCTTTTTAA GCAGCTGTAT GAAAXAAAVG CrGRGIGAGC CGCAGCGGGC 'CCCTGCAGTr 1860 

COOQGCXZTOG TCAASTGAAC TGGGCAOACC CCGGGGCTOa CAGAGOOAGC TCTCCAQTTT 1920 

CCAGGCA 1927 

Seq ID NO: C181 DNA Sequence

Nucleic Acid Accession #: NM_031866
Coding sequence: 6..2090

1 11 21 31 41 51

| | | | | |

ACAGCATGGA GTGGGGTTAC CrGTTGGAAG TQACCTCSCaCT GCTGSCC3GCC ^TGGCSGCTGC 60

TOCAGCQCTC TAGGGGOGCT GOGGCOGCCT CGGOCAAGGA GCIGaCATGC CSUUGAOATCA 120 

OOSTGCCGCT GTGTAAGGGC A7GGGCTACA ACTAOVOCTA CATGCCCAAT CAGTTCZ^AfiC 180

ftC33ftCACGCA AQACOAGGCG CGCCTCGAGG TQCACXIAGTT CTQQCCGCTO 6TGGAGATCC 240 

AOTGCTOSCC OGATCTCAAG TTCTTCCTGT GCAGCATGTA CAOGCCCATC TCSOCTAGAGG 300

ACTACAAdGAA. GOCGCCGCOG CXSCTGCCQCT GGGTGlxaCGA GCGC3GOC3^ GCCX3GGIGC3S 360 

OSQCGCTC&T GOGGCAGTAC GGCTTCGCCT GGOO0GACG6 CATGCiGCVldC QACOSGCTGC 420 

CX3GAGCAAGG CAACCCIGAC AOSCTGO^GCA a:GGACTACAA COGCACX^GAC CXAAOCAOOS 480 

COGCOCCCAG CCaSCCOOGC CGCCTGCXXJC GGCX33CXS3CC CGGCOAGCSkG CXSacCTTDQQ 540 

GCAGOGGCCA CGGCGGOGCG CCGGGGGOCA. GGCCCCCGCA CCX30GGAGGC GGCAGGOGOG 600

GTGGOGGGOQ GGAGGCSaSCQ QOBCCCCCAG CTaQOGaOGG CaGaSGOEGGC GGGAAGGCSGC 660

OSGCOOCTGG 0GGGGGG6C6 GCTCOCIGOG AGOCOGGGTS CC3U3TGCGaC GOOOCIATGG 720 

TGAGOGTGTC CAQCGAQCaC CACCCGCTC^T AjCAACCGCGT CAAGACAGGC CAGATCGCTA 780

ACTGCJGCGCT GCCCTQOCRC AACCCCTTTT TCRGCCAGGA CGAGOGCGCC TTCACOGTCT 840 

TCTGGATCGG CCTGTGGTC3G GiGCtCIQCI HOSVGICCAC CTT06CCACC GTCTGCACCT 900

TCCTTATOGA CATGGAQOQC TTCAAGTACC GOGAGOGGCC CATXATCTTC CTCIOGGCCT 960 

GCTAOCTCTT 08TGXGGGT6 GGCIACCTAG TGOGOCTGGT GGOGGGCCAC GAGAAG6TGG 1020 

OGTGCAGCGG TGGCGCXSCCO GGCGCX3GGGG GGGCTGaaOG CX3GGGGGGGC 60GGCX3GCX3G 1080 

GOGCGGGCGC 6GCGGGCGOG GGOGCGOGOG GCOGGGGCGG GCGCSGGCGAG TACX7AGGAGC 1140 

TGGGGOOGGI GGAGCAGCAC GrGCGCTTACG A6AOCACGGG GOCOGOGCTG TGCACCGTGG 1200

TCTTCTTGer GGTCrTACTTC TTOGGCATGG OCAGCTCCAT CTGGTGGGl^S ATCTTGTCGC 1260 

TCACATGGTT CCTGGOGGCC GGTATGAAGT GGGSCAAQ6A AGGCATCX3CC GGCTACTCGC 1320 

AGTACTICCA CCTGGCCGOG TGGCTIOIGC CCAQCQTCAA GTCCATCGCG GTGCTGGOGC 1380

TCAGCTCOGT GGACGGOGftC OCGGTGGOGG GCATCIGCIA OSTOQGCSU^C CAGAGCCTGG 1440

ACRACCXGOG OGGCITGGTG CTGGCGCCGC TGGTCATCTA CCTCTTCATC GGCACCATGT 1500

TCCTGCTGGC CGGCTTCX3TQ TCCCTGTTCC GCATGOGCTC GGTCATCRAG CRACAGGAC3B 1560 

GCCCCacCAA GACGCACAAG CTGGAGAflCC T<3ATGATOOG CCTGGGOCTG TTGAOOGTGC 1620 

TCTACACCOT QCCXIOCCGDa GTGGTGGTOS CCTGCCTCTT CTAOGAGCAG CACAACOQCC 1680

OSOQCTGGGA GGCCAGGCAC AACTGCCCGT GC3CTGCC5C3GA CCTGCAGOOC GAOCAGGCRC 1740

GCftOGCCOOA CTACGCCXITC TTCATGCTCA AGTACTTCA3' GTGCCTAGTG GTGGGCATCA 1800

CCTC3GGGOGT 6TGGGTCTGG TCCGOCAAGA CGCTOGAGTC CXGGOGCTCJC CTGTGCACCC 1860 

GCTGCTGCTG GGCCAGCAAG GGOGCCGOGG TQGGCGGGGG OGGGGGCGCC ACGGCOGCGG 1920 

GGGGTGGCGG OGGGCCX?GGO OOOGGCGGCG GCGGGGGACX: OGGOaaOGGC GGGGGGCGGG 1980 

GGGGCGGCGG GCSGCTOCCXC TACAGCOACQ TCaiQCSVCTGG CCTGACGTSS CQOTCGGGCA 2040 

OGGOGAfiCTC GCSTGTCTTAT CCAARGCAGA TGCCATTQTC CCAGGTCTGA G06GAGGGGA 2100

GGGGGGGCCC AGGAGOC3GTG GGGAGGGGG6 C3GAGGAGACC CAAGTGC3U3C GAAGGGACAC 2160 

TTGATGGGCT OAGGTTCOa^, CCCCTTOICA GTGTTGATTG CTATTAGCAT GATAATGAAC 2220 

TCTTAATGGT ATCCATTAGC TGGGACTTAA AXGACrCSUinr TAGAACAAAG TACCIGGCAT 2280 

TGRASCCTCC CAaAOaC3U3C CCCTTTTCCT CCATTGATGT GOGGOSAGCT GCXCCCGCCA 2340 




CGCGTTAATT 
TGGAGCCCTC 
GAGAACCTCr 
6AGGAGGGQT 
AAA1GC3CTTA 
TTCTTACA.TT 
AAAATATGTA 
TACCCCTQTA 
6ATTTAACCA 

TGAAXTTGAAA 
CTCTTTTCTQ 
AGGCAOTAAT 
AACCCIGAAA 



TCT6TTGGCT 
CCTOGCTGCA 
TTTTCXCCCr 
GACOGCEZACC 

AGAOGATBTA 
TATATCCAAA 
AGAACA6ATA 
TTtSCOCTCTC 
GGGAAnATOQ 
TACT2X3TTAA 
TCAAAATTQA 
ATOCATGGAT 
TCC3CCm3U3G 



GAGGAGGQTS 
CTTGOCXGQG 
COftCTCTTCC 
TCATGGGATT 
CAAGAAATGT 
TTTATATAAT 
GATATJVGTGT 
TAAGT&TTCT 
CCOCOCCTCT 
ACATTTTCIG 
ACTCTAGTTC 
GTTTTTGCAC 
AACAAC7CTC 
CCTTGITATT 



C3ACTCIGC(5C3 
TTTQCftGTCA 
TAOSTAAACT 
GCACG0TTTC3 
CTTAAXTATA 
TATTTGTTAA 
GTACATTTTX 
ATTTTGTCAA 
TCXGAOCTCKT 
GCTTGTCATT 
TtRAGTTGTT 
CXTCCCCAAA 
ACITTAGT{3C3 
TATCCTGCftar 



OGTTTCCAGA 
GATACACAGA 
CCCACCCCTG 
GSTATICTTA 
CaCXXJCACGT 
ATTGTAAAAA 
TTGTAAAAAG 
TAAAATOACT 
CAOCTITAAA 
CTGTAOVCTG 
AGOCAAGTAA 
GACQGTQTTT 
ATGTAAATGG 
GGTATCACTA 



ACCXX5AQATT 
TTTCACCTGG 
ACTTACXICTG 
ATTSACCAGGC 
AAATACQOGT 
AAAAAAGTGT 
TTTAQASGCT 
TTTGATAAAT 
GTBCTTGCTA 
ACCITAGQCA 
ATATCATTGT 
TTCATCGOAG 
AACTTCT6CA 
AAGGTTTCAA 



Seq ID NO: C182 DNA Sequence
Nucleic Acid Accession #: XM 0S0625



1 
I 

CCQaC3TC5GA 
CGGGTGTCCC 
GCACCCAGCG 
CGGCTCCGCT 
CX3CTGCTGCT 
TCTITGGCCA 

AShOCATGSVA 
aCCACCOGGA 
TAGACGAQAC 
OGGTCATGTC 
AOGACAACGA 
AAGCIGCAAA 
AAACQCTTTG 
ACOGAGATAC 
TGTCCGAAAG 



ASCraGIGAT 
CCCQCftGCAT 
CTCCAaA[3CA 
TCCTAGCTGC 
OCAGCATTTC 
GOCCAOCCSeA 



11 
1 

GCOCCGOSGA 
□CTTCTOCGC 
AAG2USAGCGG 
CCCTCTGCCC 
GCTCTtOCTC 
GCCCGACTTC 
CXnCSGCATC 
GGAGGTQCTG 
CaCCAAGARG 
CATCCAOCCA 
CQCCTTGGGC 
OCXTTGCATC 
aOTATGTGAA 
TAAAAASQAT 
CAAAATCATC 
GGAOCTGAA0 
GAACGACKTC 
CACCICQQTC 
CCGCAAGCTG 
OGGCTGACCA 
TOC3U5TCTCA 
CTGAGXTATA 
ATCTTGTAGA 



21 
I 

GCTGOaCGOG 
GCCCXMSCOQ 
GCOCGrGGACA 
CCTC36GGGTC 
GCCTCGCACT 
TCCTAC»AGC 
GAATACCAGA 
GaSCAGOCCT 
TTCCTGTGCT 
TGCCACTCBC 
TTCX33CTGGC 
OCCCrCQCIA 
GCCTQCAAAA 
TTTGCACTOA 
CTQGAGAOCA 
AAATCG6TQC 



AASGeaxoac 

CAGTGCTAGT 
TTTCTGCTCC 

AGGCCAOUao 
AATATTC&AA 



31 
I 

GGCTTGCAGC 
CCOOCTBCai 
AGCTGGAACZ 
GOGCGCCXaVC 
GCTG0CTGG6 
GCaSCAATTQ 
ACaSiQOGGCT 
GCGCTTGGAT 
OGCTC3PrC3GC 
TCTGOGTGCai 

QCUGCGKCCIL 
ATAAAAATGA 
AAATAA7UU3T 
T^GAGCAAOAC 
TGTGGCTCftA 

axctgbtcat 

AGAAGGGGCA 
CCOOGCATCC 
GGGAT CTCftG 
Cri'CUUCCTG 
AGTGGATAlGC 
CTAATaiAAAT 



GCCTCGCGOG 
GCmTCXSGQ 



GATGCTGCAQ 

CTcoacaoGC 

CAAGCCCATC 
GOOCAACCTQ 
CCCGCTOGTC 
CGOCGTCTSC 
GGTQAAGGAC 
TGA6TGCX3AC 
CCTCCTCCCA 
TGftTGACAAC 
GAAGOAGATA 
CATTTACAAG 
AQACAGCTTG 
GSGAC2U3&AA 
QAlSAGAGTTC 
TGATGGCTCC 
CTCCX3GTTCC 
C CTTTTG Cac 
TOTTTTCACC 
CSUreSATATT 



Seq ID NO: C183 DNA Sequence
Nucleic Acid Accession #: NM_001306
Coding sequence: 199..861



AATTOGGCAC 
GOGGOOGCOG 
CJQCCAGGCCC 
GGOCTTOCCG 
CTOGGCTGGC 
ATCGGCAflCA 

gtgcagagca 
gaocttcagg 
ctagtogcgc 
aagatcaoca 
gtgtcctqqt 

C!AGAAG0GOG 
CTGGGOGGOG 
AAGGTCXJXCT 
GACOGCAAGG 
CAACACCACC 
GCCTCOaAOG 
TCCOC3U3CAG 
GCATGOACTG 
ACCAOCCCCr 
GCCCTGGAAG 



11 

1 

QAGGGCnSGT 
TOGGTCtnOTC 
AQC5GGC0CCG 
OGGCAGCCT^T 
IQGGCAOCAO: 
ACATCATCAC 

CGGCCGGCfiC 
TGGTGGGCGC 
TC^TGGCAGG 
OGGCCAACAC 
AGATOGGOGC 
CGCTGCTCTG 
ACTCCGOGCC 
ACTAOGWTO 
ACCACOWOG 
OCAGOOChCC 
CC3V0GGCTTT 
TGAAACXZTCA 
OGAGOCKCATC 
GGGCACTTGA 



21 
I 



31 
I 



MTTCaCSSCCB 
GCCGCT06TC 
GTOCATGGGC 
CXSTGTGCIGC 

GCAGTacAaa 

CCTCAT0GT6 
OCAGTGCACC 
C^GTOCrOTTC 
(ATTATGGGG 
GGGGCTGXAC 

GCGCTCOVCC 
AGdOACAGAC 
OGAGCTQQAa 
CXXnSAAGCC 
GC3GGOCC50Ga 
CCCTTCTGGA 
CGGGCCGCTQ 
TATTTTTCAA 



TCOGTOCGTC 
TCCOOSCAOC 
CTGGflS ftaPCA 

GcarrTGCCCA 

ATCTQSGnSS 
OTGTAGGACT 
GTQGOCATCC 
AACTGOGIGC 
CCTCTOQCOQ 

GTGOGcraoQ 

CCCCCaOGCG 
GGOCCGQOAG 
GCA08GA6AC 
06C3SQkOCAa 
A6GAAGCX3CC 

GCAGGGiSSOC 
CCCCCATGTC 
KAAAAGGCIC 



41 
I 

OGXATOQAOC 
CEKTOGGGGOS 
06GAGCCACC 
C3QGGCAOOG0 
TGTGSCQCaT 
GOCTGTGCSAT 
CGCTGCTGGC 
TGCXOGCOGC 
AGGAOGACAC 
CCClGCTCnC 
ACOQCGTOGT 
OGGOCGOSQC 
AGAAOAAiGrA 
CCAGCCTGGG 
CCCACCACCA 
GGCKTCSCftfiC 
CGC3GCIOGAC 
GGGGGCCCAS 
TGGGTGACCG 
GOGCTGGGCA 
TCGTTTTAOC 



51 

I 

C3QCTQT0CTC 
GOOCCGAjGTC 
CGCCCTTCCC 
GGOCCTGGCr 

gggcxcttoc 

OCTOCCAACC 
CT6GGCCAGG 
ATGAAQCAGT 
CTCX3ATGACC 
OGCTQCaCCC 
OGTTTCCCCC 
GCCACO^AGG 
GACATAATGG 
ACCTACATCA 
CTGAAC3GGTG 
CAGIGGACCT 
CAGGGTGGGG 
AAQCBCATCT 
GACAGGCCTG 
CCAAGCACAC 
QTTTGCATCC 
XAAAGGAAAA 
TTTATGAAGT 



51 
I 

OGnSCOGtXA 
OCSGCAiOCTCC 
OGGTGGAGGG 
GCTOGCOGTG 
OTOGGCCTTC 

ACTGCCACAS 
CTTCGGGCTQ 
GGCCAAOKX: 
CCTCGTGC5CG 
GCXXSGMSGO? 
GCTGC3U3CTG 
CACGGCCRCC 
CACAOGCTAC 
CCAjCCACCAC 
gtqcm3c3ctt 
rrGGGGCHGCT 
GGAOCAACCT 

ccaatacttg 

GGGACOGGCA 



Seq ID NO: C184 DNA Sequence

Nucleic Acid Accession #: NM_012449.1

Coding sequence: 66..1085

L " " " r 

OOSAGACTCA CQGTCAAGCT AAGGCXSAAGA GTOOaTGGCr GAAGOCAIAC TATTTTATAQ 
AATTAAXCm AAGCAGAAAA GACATCACSUi ACCAAGAi«3A ACTTTGGAAA ATGAAGOCTA 
GaftGAAATTT AGAAGAAGAC GATTATTIGC AVy^GfiACAC GOQACIAGACC AQCATGCTAA 




2400 
2460 
Z520 

2640 
2700 
2760 
2820 
2BB0 
2940 
3DD0 
3Q60 
3120 
3100 
3195 



60 

120 

160 

240 

300 

360 

420 

400 

540 

600 

6G0 

720 

7B0 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1382 



GO 

120 

160 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 « 

1080 

1140 

1200 

12S0 



60 

120 

180 



7VAAGROCTOT 
CAOAACTTCA 
CTAXTATAQC 
CAACTTCCCA 
CAATCSGTTTC 
TCCAACTTCA 
TAACAAQAAA 
GTCTGTCrrA 
AG6TGCAACA 
ATGTGTCTCT 
CATCTGTGAG 
TTGTTTCCCT 
ATATAAAACA 
TT6ITGTCCT 
AGATZASUCA 
TGTAGAATTA 
TCAAOTTTOT 



GCTTTTGCAT 
GCACACACAG 
ATCTCrrGACT 
TCAACAATAT 
CAT CACTCTC 
TAATQGAACC 
GCAGTTTTGGG 
CCCPATOAOG 
AAATAAAGAfk 
GOOAATTGTG 
TGACTCTTTS 
TCTACTGOGC 
ATTTGTATGG 
GA.TATTTAAA 
TGGTTGGGAA 
CTGTTTACAC 
ATTTGrTAAT 



TTGCACCAfeA 
GAACTCTTTC 
TTTCTTTACA 
TTTTATAWiA 
TTGGCATTGG 
AAGTATAAQA 
CTTCXCAGTT 
CGATCCTACA 
GATGCCrGGA 
GGATTGGCAA 
AC3VTGGAGAe 
ACAftTACACQ 
TATACRCCIC 
AGCATACTAT 
GAC8TCAGCA 
ACATTTTTGT 
AAAATGATTA 



CRGCCCATQC 
CACAGTGGCA 
CTCTTCIGAG 
TTCXaWiTCCT 

TTTACcracc: 

AGTTTCCAC3V 
TCmiTTTGC 
OATACAAGTT 
TTGftGCATGA 
TACTGGCTCT 
AATTTCACTA 
CATTGATTTT 
CAACTTTTAT 
TCCTGOCATG 
AAATTAACAA 
TC»ATATTGA 
TTCAAOSAAA 



TGATGAATTT 
CTTGCCAATT 
GGAAGTAATT 
GGTCATCAAC 
AGGTGTGATA 
TTGGTTOGAT 
TGTACTGCAT 
□CTAAACTGG 
TGTTTGGAGA 
GTTG6CTQT0 
TATTCAGAGC 
TGCCTGGAAT 
GATAGCTGTT 
C3TGAGGAAG 
AACTGAQATA 
TATATTTTAT 
AAAAAAAAAA 



GACTGOCCTT 
AAAATAGCTG 
CACCCTTTAG 
AAAGTCTTGC 
GCAGCAATTG 
AAGTGGATGr 
GCAATTTATA 
OCATATCRAC 
ATGGAGATTT 
ACAfCTATTC 
AASCTA6GAA 
AAOTOGATAa 
TTCCTTCCAA 
AAQATACTGA 

TGrrcccAGi: 

CACCAACATT 
AAAAA 



Seq ID NO: C185 DNA Sequence

Nucleic Acid Accession #: NM_001775.3

Coding sequence: 70..972



1 
I 

CTAAASCTCT 
TGGAGCCCTA 
CTCTCTAGGA 
GTGCTCaCQG 
CGCTTTCCOS 
AGACATOTAG 
GCTTGCAACA 
CCTTGCRACA 
GTCCAGCGGG 
ACATGSTQTQ 
AftGGACTGCA 
GAASCTGCOT 
AAAAAC3U3CA 
CTAGAGGCCr 
ACCSkTAAAAG 
ATCTACAGAC 
TCTQAGATCT 
CATCATACAO: 
CAATAAGOTC 
CATCAGCATA 
AATGAAAATT 



11 
1 

CTTGCTCSCCT 
TGGCCAACTG 
GAGCCCATVCr 
TGQTCGTCCC 
AGACCGTCCT 
ACTGCCAAAG 
TTACTGAAGA 
AQATTCTTCT 
ACATGTTCAC 
GTGAATTCAA 
GCaACAACCC 
QTGATGTGGT 
CTTirGGGAG 
GGGTIGATACA 
AGCTGQAAXC 
CTGACAAQTT 
GASCCAGTCG 
GACTCAGC^kT 
AATOOCAGAG 
dCTTTATTOT 
GTATGTTAnS 



21 

1 

AGCKTCCTGC 
CGMTTC3U3C 
CTOTCTTGGC 
GAGGTGGOGC 
GGCGCGATGC 
TGTATG6GAT 
AGACTTATCAS 
TTQGAGCAGA 
CCTGOAQQAC 
CACTTOCAAA 
TGTTTCAfiTA 
CCATGTGATG 
TGTGGAAGTC 
TGGTGGAAGA 
GATTATAAGC 
TCXTCAffTGT 
CTGTOGTTGT 
ACCTGCTGGT 
AOGGAAGCCX 
tSATCTATOVA 



31 
1 

OSGCCTCATC 
CCX36TGTCCG 
GTCAGTATCC 
CAGAOGTG6A 
OTCAAGTACA 
GCTTTCAAGG 
CCACTAATQA 
ATAAAAGATC 
ACGCTGCTAG 
ATAAACTATC 
TTCTGQAAAA 
CTCAATGGAT 
CATAATTTGC 
GAAGATTCCA 
AA?UV6GAATA 
GTOAAAAATC 
TTTAOCTCCT 
GCAGAGCTGA 
IXTTCIMCAA 
TASTCAftGAft 
TAG 



41 
1 

TTCGOCCAGC 
GGGACAAACC 
TGGTCCTGAT 
GGGGTCOGGG 
CTGAAATTCA 
GTGCATTTAT 
AOmiGGGAAC 
TGGCCCATCA 
GCTACCTTQC 
AATCTTGOCC 
CGQTTTCCCB 
CCCGCAGTAA 
AACCAGAGAA 
GAGACTTATG 
TTCAATTTTC 
CIGAGGATTC 
TGACTCCTTG 
AGATTTTGGA 
AGTCXXAAAA 
AMVTTATTGT 



Seq ID NO: C186 DNA Sequence
Nucleic Acid Accession #: XM_120513.2
Coding sequence: 1..2208



ATOGTGTCAT 
TACGOCTTGC 
AGGACCCCAA 
TTTTCAGAAG 
TACICTGAiCG 
I3TCATGCAGG 
GCCI0GA6TC 
GCCCCCCGCB 
ACGGCACAGA 
OGGAGGCACC 
CCOAGiaCGCC 
ATCIACMU3C 



CC3GGCGCAGC 
GTCAGACrCG 
C060CCCTCC 
TCCCX96CCAT 
GATACACATA 
TTTG6ATTGA 
ATTCirrOQAT 
OTGGGGGTOC 
TTGQATGCCT 
CTCCTGGACT 
TACAACXSAAA 
AATOTAQACA 
GTTTTCACia 
TGGTCAGCCA 
CATACATTAG 
TCCAGAGAGO 
GGTOCTTCAA 
CTT6W3GATG 
ACCTOCTIOQ 



XI 
I 

aCAOGTTCTC 

AGCCCCGTAA 
QATOGGA6AG 
ACCCCGCCXX: 
GOGGGCCAOG 
CACAXeiGGG 
GGGGGGCTOS 
CACa^CJiCCCT 
CTCGGAGOCC 
GGATGGAGGA 
AGC8CAGCGC 
GCTAOGAGOG 
0GCCX3CCCGC 
GGGCTCCTCC 

rrccTGCCCC 

C00CTO6TOC 

rrGTOQAQaTC 

TG6CTCAGAG 
CITGGGGTAC 
TGCTGCCAAG 
TTGCCChlSCA 
CCAACCATTC 
GACAGGAGCA 
TAQAGATGCA 
ACTCTCTACA 
GTGAATATGG 
AAAAlTAXnC 
CCXCTQTCAA 
GACCATTTTC 
GCATCX3GCAS 



21 
I 

GGGGOCXSCTA 
GXTCCOCSKTG 
ACACOGOGOS 
CAA7TCCC6A 
AACGACIAGC 
AGCXCXirTTTC 
aVTTGGCQAa 
GGGAT60CA6 
CCCACAOGOG 
GGOGOCCGGC 
AGAjSAXGCAS 



AOOCACCCIG 

GOGTX:CGCCA 
GGGAGTCGCA 
OCGTCOGTOG 
CCAfSTCP^CA 
AAGAT6GCAG 
TTGGCCA3GT 
GTCTGAGGGT 

TCRGTCCATG 
CTGTCACATT 
GTATATGCMV 
AAATTACCTG 
CGATCIGGCC 
ACTFTGATTCC 
GAGTGATlTT 
AGGGGCACAG 
OCCCGTCCCT 
GCIGTTGAAC 



31 

I 

OGGGAAACAA 
AGdCAGAGG 
AOCAOCOGGG 
TTGGTAGAAS 
CCCTCCTCTG 
GGAAaOCGOA 

0CTCM3CGAC 
OGCAOCAGGG 
GGGGAOGGGA 

caaacAGAGG 

CTCA2UIAI0GT 
GTGGAGCTGC 
aCCTCCTOGT 
OGCOGTGGAT 
GGCACTCn^C 
CACeCIGOCG 
CTTGCaSGAT 
CATAGATCTI 
GGACAGGATT 
AArrACTGCTC 
OTTCXTAiGOC 
AWrCTTGCG 
GGGAAAGCGG 
AGGAAACAAC 

GAOGTOGATC 
ACAGCTTCCC 
CCAAATCCTT 
CAAAGCATCG 
TTQAGnrOTG 
AACTATOCTG 



ATGAAAAOGT 



OCAAGATCrr 
AACTTOCTGT 
TGCaACOOOG 
GCCCGGQC6C 
AATCM9QA6C 
JU9GCGGGG6C 
CAGACCCOGC 
CGTGCTOCGA 
AGGOQCCXAO 
TGCCCAGGAA 
OSCACXSGOCA 
CXyrCTTOQTT 
TTCGGGCX3CG 
TCC3CACCGCC 



CTGACACCAT 
TAAAGCAOTT 
GGCTGGA6AA 
CTAAGAAGAG 
TCTTAAATTG 
TAAAGCAfiGA 
TCCACASTCA 
AAACTTCTGC 
GCTTTCCAAC 
CXCIGTCAAC 
TGTGTAAATC 
TGCMaCAGGC 
CTTAC31GGGT 
AGGCCtTGGA 
TCTACATAAC 



51 
1 

CAACCCOGOC 
CTGCTGCCDGG 
CX!TCGTCGTG 
CACCACCAAG 
TCCTGAGATG 
TTCAAAACAT 
TCAISACCGTA 
GTTCACACAG 
TGATGACCIC 
AGACTG6AGA 
CAGQTTTQCA 
AATCITTGAC 
GGTTCAGACA 
CX3M3GATCCC 
CTGCAAGAAT 
ATCTTGCACA 
TGGTTTATGT 
GGGTCCTCCA 
TAACTTATAT 
ATAAGRTTAG 



51 
1 

GAAAAAGTTC 
COGGGAAAGC 
CAAGAGGTTC 
AATACAC3U33 
AGAGTTTGGG 
AGCCGCAGAA 



GGCGG(3G6CC 
GGGCA3G0GG 
GGGACOGGOC 
CSTCCSDCAAA 
GAGAGCGGOG 
CXTGAGGiACT 
CGCOGCTGTC 
GGGAACCATC 
CAOGTOGTOG 
ASGGftCCrCA 
6GTCTCTGTC 
TGAGTGGGGA 
GGAOGGTCAG 
TOGAATQATC 
TGGAGGAAAA 
AGGCTCAAGT 
GAGCTCAGAC 
GTTTTTG7VGG 
TCCAAACCOC 

TAGGCATCTA 
CTTGGCTGGG 
GAACTCTGAA 
AATGGATTTG 
GAGCAAACAfi 



240 

300 

3^0 

420 

4BO 

540 

600 

660 

720 

780 

640 

500 

960 

1020 

10 80 

1140 

119S 



60 
120 

leo 

240 

300 

360 

420 

480 

540 

600 

6E0 

720 

7B0 

840 

900 

960 

1020 

1080 

1140 

1200 

1233 



60 

120 

IBO 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1060 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1660 

1740 

1800 

1860 

1920 






TOGGAT62USG CltSTAAATTC TTCAAAGAA& GA.l'GGGAGAC GGCTCCTICG ATACCTCATC 1980 

AOATTTGTTT TCACAAC06A TOAGCTTAAO TACTCATGCQ GCCTTGOQAA AAf?OAAAAGQ 2040

TCAGTGCAGT OySGACSAOAC AQOTCCCGAA AGACGCCCTC TGGATCCAGT TAAA6TAACA 2100 

TQCCTCCC3AG GTACTGCATC CTTCCGCTCA. GTGTCA.C3ZAT CTGTGATCTC ATTTCACCGC 2160 

ATIGGCTGTG GCTCTCCCXX5 TACAAGTGlTT CAGCCTTCTG TATTTTGA 2208 

Seq ID NO: C187 DNA Sequence
Nucleic Acid Accession #: AB037745.1
Coding sequence: 26..1744

1 11 21 31 41 51 

I 1 I I I I 

ATOaTGGARC AOOCTGCCCA CAAACAltSGA AACGACCGTT CTCAQTOOQA TCAACTTOGA 60 

GTACAAGGGC ATGACAGGCT GGGAGGTGOC TGGTGATCAC ATTTACACAG CTQCTOSAGC 120 

CTCftflACflAT GACTTCATGA TTCTCACTCT GGTT3T(3CX3t GQATTTflGAC CTCCGC3W3TC 180 

GC3T6ATGGCA 6ACACACAGA ATAAAGAOQT GGCCA6AATG ACATTTGTCT TTQAOACCCT 240 

CTGTTCTCTG AACTQTGAGC TCTACTTCAT QUTGOGVCyca AATTCTAISGA CCAACACTCC 300 

TGTGGAGACG TC?QAAA(?GTT CCAAASOC^ ACAGTCCTAT ACCTACATCA TTQAOeAQAA 360 

dRCTACCACG AGCXTCACCT GGGCCTTCCA GAGQACCACT TTTCATGAGG CAAGCAGGftA 420 

GTACACCAAT GACGTTGCCA AGATCTACTC CATCAATX^TC ACCAATGTTA TGAATGOCfiT 480 

GGQCTGCTIU:: TGCCOflCCCT GTGCGCTASA AGOCTCTGAT CXGOGCTCCT OCIGCACCTC 540

TTGTCCTGCT GGTTACTATA TTQACCBAGA TTCAG6AACC TGCCACTCCT GCCCCXXITAA 600

C5\CAATTCTG AAAGCCCACC AGOCTTATGG TGTC3CAOGCC TGTGTGOCCT GTGGTOCA^ 660

GACCAA0AAC AACAAGATCC ACTCTCTGTG CTACAATGAT TQCACCTTCT CACGCAACAC 720 

TOCAACCAGG ACTTTCAACT ACAACTTCTC CGCTTTCSGCA AACACOGTCA CTtJTTQCLQa 780

AGOaCCAAOC! TTCACTTCCA AAGOGTTGAA ATACITGOVT CACITTAlCCC TCSWSTCTCIG 840 

TGSRAACCAG GOTAGGAAAA TGTCTCTQTO CACCBACRAT GTCACTGACC TOC3GQATTOC 900 

TtSAGGGXtSaG TCAGGGTTCT CCAAATCTAT CSWCAOCCTAC GttCrrGCCftlQG CAGTCATCAT 960 

OCCCCCAOAO GTOACAQGCr flXMG0CC3GG GGTTTCCTCA CAQCCTQTCA SCJCTlXSCTGA 1020 

TOGACTTATT GGGGTGACAA CAGATA^GAC TdQGATGGA ATCACCTOCX: CAGCTGAACT 1080

TTTCCACCTO QAGTCCTT08 GAATIACCGGA GGTGATCTTC TTTTATIUSGT CCAATGATGT 1140 

GA0GCA6TCC TGCAGTTCTG GGAGATCAAC CACCATCCGC GTCAGGTGCA GTCCAGAQAA 1200 

AACTGTCCCT aOAAOTTTGC TGCTGGCAGG AAOGTGCTCA GATGGOACCT GTGATGGCTG 1260 

CAACTTCCAC TTCCTGTGGG AGAGCGOGGC TGCTTGCCaS CTCTGCTCAG TGGCTGACTA 1320 

CCnTQCTATC QTCAGCAGCT GTGTGGCTGG GATCCAGAAG ACTACTTAOG 1<?tX3G0G&GA 1380 

ACCX»AGCIA TGCTCT6GTG GCATTTCTCT GCCTGAI3CSU3 AGA6TCAGCA TCTGCAAAAC 1440 

CATAfaATTTC TGGCTGAAAS ^TGGGCATCTC TGCAGGCACC TQTACIGGCA XCCTGCTCAC 1500 

OGTCTTGAOG TGCTACTTTT GOAAAAAQAA TCAAAAACTA GAGTACRAjGT ACTCCAftSCT 1560

GOTOATGAAT GCTACTCTCA. AGGACTGTGA CCTGCCAGCA OCTGACAGGT GCGCCATGAT 1620 

G6AAOGOGAG GATGXAGAOG AOGAfXTTCAT C^TTTACCAGC AAGAAGTCAC TCTTTC3GGAA 1680 

GKTCAAATCA irTZAOClCCA AGCftGCJCAGC TGCTGTCACC ATCTClCTTT CAGAGGACTC 1740

CTGATGGATT TGACTCAGTG COGCXOAAfaA CATGCTCAG8 AG6COCAGAC ATOGACCTOT 1800 

eaOAOQChCr GCCWSCCTCA CCTGCCTCCT CRCCTTGCAT AGCACCTTTG CAAGOCTQCG 1860

GOGATTTQGG TGCCAGCATC CTGCAACACC CACTQCTG6A AATCTCTTCA TTOTGGCCTT 1920 

ATCAGATGTT TOAATTTCAG ATCTTTTT^ ATAGASTAOC CAAAGC3CTCC TTTCTGCTrG 1980 

CGTCAAAOrr GCCAAKTATA OOCACACTTT OTTTOTAAAT TAl:GOCCTT6 CTTGTATCTT 2040 

□TTVQCXaAA ATOSCXXAXC CGCSCnSAlGCC ATAGCITOGT CTGCTCATAA TTCTTATAGC 2100 

TTTQGAATGA AAATATTTCT ATCTTCTTAA QTATASAAAC TATTTCCTCT GTCCTCTAAC 2160 

TTAAfi£36CA6 AAAC5«3CT©G GAGTTXTCCT CGCATGCCCT CAflCTCATGA TCTCTTCAGG 2220 

AGA6AGGCTG GGTGAOGAGG aXGTOQaGGT TCCCK3GTG6 ATAATCTTCA TAOCSkGCCTG 2280 

GA3QC%TITG CGCTGK3ATAA CCAGGICRAA GGGAOTOAAA ATOGTAGTCr GiUSGGChaOG 2340

GGA0CAA3GC CTOGGTAAQA AAAOCCTTGA AAM3CAXAAA AAGAiGGCCGG Q03CGQTGGC 2400 

TCAG6CCTGT AATCCCAGCA CTTTOOGAGG CCGAtSGOGGG CAGATCATGA G6T0GGGAGA 2460 

TTGAOAGCAT CCrGQCTAAC ACGGXGAAGC CCOGTCTCTA CTGOAAATAC AAAAAAITAG 2520

CGGGGCGIGS TG60G0GTGC CTGTGGTGGC AGCTACXCG6 6AG6CTGAGG GGGGAGAATA 2580 

GGSTGGGOCI 8GAAG(3at3t3A GCITGCAGTG AjSOCOAISATC aCSaCQftCTOC ACTCCATCCft 2640

GCCTGGGTGA CftSAGTOAGA CTCTGCCTGA AAAA?kAAAAA AAAAAA3U3AA AAGCAC3UUUI 2700 

AQ3U5QCAACA AGGAATGTTT TXGTTTTTGA GACAGGCTCT CACTCIOTCA CCn^AGGCTGG 2760 

AGTOCAGTGG CGTAATCACT GTTCAGTQCA GCCTCAAGCT CTTGQGCTCA GGCTATCCTC 2820 

CCATCTGAGC CTCTCAAGTA GCXGGOACTA C3GA£3TGTOCA OCACCAGGCT CACTAATTTT 2880 

TGTGTTTTTT OTABAiCACaO GStTTCAOGQ TGTTGOOCIU} aCXGOTCTCC AadCClGGG 2940 

CTCAAGTGAT CTGTGOGGCT CGGCCTCCCA AACTGCIGG6 ATEACAGGCA TAAQOCACia 3000

C!ACrCAflOCT TTTATTTSTT TTTTAAACCA ayiMCTCAT TGCCTTCTCT TAAJSCAAATG 3060 

ATAOATATTC TCACTGAAOC CAAZU^QAATA AGTTCRTCAA GAAAATGOCC AAAGCCCTGG 3120 

TGGATACAIC CTCCCXATCT TTTTTTTAAA CGTTGCACTA TCACTCTATG AGACTGAAAA 3180 

GAACCAOGTA A0CCOGAAAC CCAGATOTTC CnTOCITATC CTCTASXGGG TTTAGCCACA 3240 

GftCKTAGCMk ACGCTGTCA6 TGAGQAAAA7 TCCCCAXCCT IGAGTGCGOC OSIOCTAOIUV 3300

OTTTOGGOCA TATTATQOAA CAGGGCTCTC TTATTTGAAA AjG3W3CRCftAG OAQGCX3««5A 3360 

TTTTAATGGG GCACTTTAGG GOATACAGOC CACM^TOGCI^ TGGGCCTGM GrGGOOGTGA 3420 

TGTCTGCrrC TAAGCTXAAC GCATCTGCTC AGGCACAGAA TAAAGGTCTA GGCTGGCCAA 3480 

AAAAOGAACT GAATCCCAGG CCCAXAOGCC AGCACCAQAA TCAAACCA6T CTCCIUhBGAA 3540 

tSGAAGGCTAG GAGASTTTAA CAAfiATTTTTC ACIGGGOGCA GCATGGTO0C TC3UACCTGT 3600 
AATCCCAAGG CAGAATGGTG GCTTGAGCTC AQQIUGTTCAA QACCAGCCTG GGCAACRGAG 3660 
TQAG&COCTG TCTCTAAAAA ATTTAAAAAT AAACAAGGTG TTCACCAAGC TGGOATACTT 3720 
CTCACTATTA AGGCCCTATC TTTCTCTTTT TTTCATTCTC AATTGCTTTG TGTGATAAAA 3780 
AACTAAAOAa ACTTCTTGGTC CAATTTCTQG CAACATGCCT TCTGAAACgOT aAl9Z!AGAGG!G 3840 
GGTGTCTTCT ATGCCCRTTT TCCCCAATTT TACACAAACT KmOCfOOG AACTTTTAAG 3900 
TACCTAdAAT GGGTAAAACX: AI5AGCAAGAC TTTAAATTAC CTTCTTCTTT CITCTACIGG 3960 
CAGTrCTQCC TCXaiTC!ACTA TCAGGCTAGG QTOACCTTCC CTTGGTGAAG OOOCAATTGC 4020 
CCATGATTTG TGCCTGTGCC CTTTCTCCAG T6AOCATTTG GTGACCAGAT OSI3VSATA7A 4080 
QZAAGGGSAT GGCATTTGCA AiSTGACTAOT CTGCCACAAA ATGCTCSVTCT OATiaOCCaC 4140 
TGCraoCCTG □GAATGOCTT TGTAAOAGTC AASTGAGAACT AGAQCCAGGC XGTGGTCCXT 4200 
GGC CATCAAC .AGT6XTGGT6 AGGGCAGGGA. QTCCCTTTQG TTTAATAAAT GCaUSTTTTTC 4260 
TTTGGGTATC GAAATTCTCC CCTCCTTTTO TAGGAGTCAG GCtCICAaAA CCIGTGTCCA 4320 
TGTTGGAl^ TGCCCCaOTG TGGATGCftGA TAOGCAGCTC CTGACCTCCA GCC TAftftGT C 4380

TTCTOTAGCC TCAGCAATAC TTQQg Q WOCT GOTGTCTCAC TGaUVTAGCTT TCTTTTOTQA 4440 

CAAAGGCCAC AGACAGCCCT TAGACTATTC CGGAAACAGT AGGAAAAATT ACATATGTCT 4500 

TTGACTTCTT TATTCTGACT OCACTGATTT TAOCCATAAT ACTTTAAtSGA GCTACTTTTT 4560

ACTACCCCTT ACCGTGCXGA CTTCTQCAGG TCTGCCCTGT GACCTGTCAfi GAACTCCTGA 4620

6TIAGBCTAC TGGGGTCACC TGTTGCTDCC CTAOCAAGTT AOGCATGTCA TATATTTTTA 4680 

ACaVGCTTTAT TGA6ATATAA TTCACATATT ATACAATTCA OCTTTAAAAC ATAGGATTCA 4740 

ATOGTTTTCA GCAAACTCAC AGAjGTTGTCC GCCCACTTGA OAGCAAAGAC ATQTTCRATT 4800 

TTCTTTTCCT TTTTTTTTTT GAGACRGRGT CAGCTTTGTC GCCCAGQCTG GAGTGCAGTG 4860 

OCATGATCTT GGCTCACKSC AGCCTCCOCA TCCTGGGTTC AAlSTGATCCT TCTGCTTCAQ 4920

CCrCCCCAGT AGCIGGGATT ACAGGCATGC GCCS^CCACXaC CTAGCTAATT TXtGlGITTT 4980

TAGTAGA6AT GGGGTTTCAC OOTGXTGGCC AGGCTG6TCT CAAACTCCT6 GACTCAASTB 5040 

ATCCACCCAC CTC3GGCCTC3C CAARGTGCXa QQATTGCAGG TGTGAGCCAC CX5TGCCTGGC 5100 

CTACGTGTTC AATTTTCTAT DAACAAAGGC TTTAGTCCTT QACCCAGGGC TAAAGTGQTC 5160 

TQTCCAAGCT OrrGTTGGTA GAGGGASTAT GATAAAAXGT TTAAATCTCA TTTEGGTTACC 5220

TTGIUSTOCrG OAACACQCIUS TAACTGTCAT GCTATAGTCA TCATCTGTAT TTGGCTGGGA 5280 

ATAC2^2yVTGA AGATTGTGGT GTATTCAAGC ASTAOGGTTT TTGCTTTTeT TJTTGTITTA 5340 

GTGC3CAACRA AACTTTTTTT TGTCTGACTA CATTAAAQAT AAGACTGACT ATATTTATAC 5400

AACAGAAACT TTGTAATAGA TTTTTTCAGC TTTGTGAAAT CGAATTTTTT TTCATCAGGG 5460 

CTOGTTG6AT TTOCTTTTTA CCCTGTAATC CAAGOGTTAA TAtSTTlSTTA GAAGATQOGT 5520

ZATTGCATGT CaWTCTTTTTT TTTTT Q TAAA ATAAAAACAT ACCTTAC 5567

Seq ID NO: C188 DNA Sequence
Nucleic Acid Accession #: NM_014324.1
Coding sequence: 89..1237

1 11 21 31 41 51 

| | | | | |

GGCGCCGQOA TTGGGAGGGC TTCTTGCAGG CTGCTQQGCT GGGGCTAAGQ GCTGCICAGX 60 

TTCCTTCAGC GGGGCACTGB OAnQCGCCAT GGCACTGCAO GGCATClCOG TC6T6GAGCT 120

□TGGGGCCT6 6CCGCGGGGC GTKTCTGTQC TATGGTCCTG GCTGACTTCG GQGCGCGTGT 180 

GGTAGGOGT6 GACC3BGCCCa GCIGCOGCTA OGAOGTGAGC C8CTIGGG0C GGGGCAAGC8 240 

CT0C3CTAQTO CTGGACCTGA AGCAGCOGCQ GQAOC3CGCX3T GCTGCGGCGT CT(?TGCAAGG 300 

GGTC3GGATGT GCTCCTGQAO CCCTTCCGCC GOGGTGTCAT QQAQAAACTC C3WCTGGGCC 360 

CAGAOATTCT GCAGOGQGAA AATCCAAGGC TTATTTAIIGC CAGGCTGAGT GGAXTrGGOC 420

A6TTCA6GAA AGCTTCTGCC G&srCMSCXGG CCAOSATATC AACIATTTQG CTTTGTOVdQ 480

TOTTCTCTCI^ AAAATTGGCA GAAGTGOTQA 6AATCCGTAT GCCOOGCTSA ATCTOGT08C 540 

TGACTTTGCT GGTGOTQGCC TIATGTGTGC ACTOGOCATT ATAATGGCTC TTTTTOACCG 600 

CACAGGCACrr GACAAGGGTC AGOTCATTQA TQCRAATATG GTGGAAGGAA CAfiCATATXT 660 

AAGTTCTTTT CTSTOOAAAA CTCA<3AAATC GAGTCTGTGO OAAGCAOCTC 6AGGACA0AA 720

CATOTTGGAr G6TC9GAGCAC CTTTCTATAC GACTTACAGG ACAGCACSATG GGGAATTCAT 780 

GGCTGTTG6A GCAATAlGAAC CCCAGrTCTA OGAGCTGCra ATCAAASGAC TTQGACIARA. 840 

GTCTGATGAA CTTCCCAATC AGATGAOCAC GGAKSATTGG CCAjGAAATOA AGAftGAAGTT 900 
TGCAGATGTA TTTQCAAAQA AGAOGAAGGC AGAGTQGTGT CRAATCTTTG AOGGCACAQA 960 
TGCCTOTGTa ACXCCGGTXC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAOGA 1020
ACGGGGCTGG TTTATCAGCA OTGAGGAGCA GGA0GT6AGC GCCGGCCllG CACCTCTQCT 1080 
GTTAAACZUZG CCAGOCATCC CTTCTTCCAA AQGOGATCn 7TCATM3GAG AACTkC&CIGA 1140 
GGAGRTACTT CJAAQAATTTG GAtTCAGCCG AGAAGAGATT TATCRiSCTTA ACICAGATAA 1200 
AA1CATTGAA AGTAATAAGG TAAAAGCTAQ TCTCTAACTT CCAGGCOCAC GGCTCAAGT6 1260 
AATTTQAATR CTGCRTTTAC AGTGTAGAGT AACACATAAC ATT6TATGCA TOGAAAOWTe 1320
GAGGAACAGT ATTACAGTGT CXZTAGCACTC XftATCAAGAA. AAGAAXTACA GAGTCTGKIT 1380 
CTACAOTOAT 6ATTGAA1TC TAAAAATGGT XATCATTAGG GCITTTGATT TATAAAACXT 1440 
TGGGTACTTA TACTAAATTA TCGTAGITAT TCTGOCTTCX; AGTTTQCTTG ATATATTTOT 1500 
TCarATTAAG ATTCTTGACT TATATTTTQA ATGGQTTCTA GTGAAAAAGG AATGATATAT 1560 
TCITGAAOAC ATCX3ATATAC A^CTIATTTAC ACTCTTQATr CTACAAT6TA GAAAAIOAjaG 1620
AAATGCQ^ AATT6TATOG TOAXAAA1U3I CZUCGIGAAAC AGaSTGATTQ GOrXGCATCCA 1680
GGCCTTTTGT CrTGSTGTTC ATGATCTOCC TCTAAGCACA TTOCAAACTT TAGCAACAOT 1740 
TAICACACTT TGTAATTTGC AAM3AAAA0T TTCACX:TGTA TTGAATOVQA ATGCCTTCAR. 1800 
CTQAAAAAAA CATRTCCAAA ATAATGAGGA AATOTSTTOQ CTCACTAOGT AGAGTCX3lfiA 1860 
GOGACAGTCA GTTTTAGC3GT TGCCTGTATC CAGTAACJECG GGGCCTGTTT O0C0GTGG6T 1920
CTCIGGGCIG TCASCITTOC TTTCTCCATQ TOTTTOAXTT CIOCTCM3GC TGCKEAGCKAG 1980 
TTCTGOnTCr TATACCCAAC ACACAGCAAC ATCCAQAAAT AAAGATCTCh. GGhOOOCCCA 2040 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 2068

Seq ID NO: C189 DNA Sequence

Nucleic Acid Accession #: XM_091332.1
Coding sequence: 1..1401

1 11 21 31 41 51

| | | | | |

ATGCUAGOT GQACACIGTG GGCIOCAGCC TTCCT^USCC TCCACTCIOC ACAGGOCTTT 60 
OCACAAACAG ACATCAGXAT CAGTCCAGCC CTGCCAGAaC TGCCOCTGCX: TTCCCIGTQC 120 
CCCCTGTTCT GGATGGAGTT CAAAGGCCAC TGCTATCGAT TCTTOXTCT CRATAAGAOC 180 
TOOGCTGAGG CCGACGTCTA CTGTTCTGAG TTCTCTGTGQ GCAGGAAGTC OGOCRAGCrG 240 

GOCTOCATCC ACAGCTGGGA GOAOAATGTC TTTGTATATG ACCTCX3TQAA CAOCIGTGTr 300
GCX:GGCAIGC CAGCTGACGT CTGGACAGGC CTTCATQATC ACAGACAGGA AGC3GC3W3XTI 360 
GAA30GACT6 ATOaClCATC CZA-nSACTAC AGCTACTGGG ATOGCAOOCA GCCAGATGAT 420 
GGCGTCCAjGG CGGACCCAGA AGAAQAtSSAC TQCQTGCAGA TATGGTACAG QCCXACCAGr 480 
GAGCAGCTAC AGGOCCCAGA GCCCCAGTTA OCCTTATCAA TCTCAGAGGC CACAGATGTC 540 

TATCrOCCTG AGQATTXCCC AQCTGAGCCC AAGCTCATQG ACCAOTCXZTG GGlGTOCRGG 600
AAGAGCCTQA AftCCATCXAA GAOTCATCTT ATQOAGCSCAC CCACTOCAOT QGCCAAGCakC 660 
CAAAAOGCAA AOACCOGACA TAOGAGGCIG OBCG G CgPCT GGTBaOCATC AGGVAAGGCI 720 
GGGTO^IXSGA AAGAAAGAAT OAAXGCAGAC TAGGGGGQAA GAAAGOGAIC OaCCCCanGG 780
CnGGAAOaCC GGCTOCGGTG CAOOQAGCGC OGCCTOGGGG CTGCTTGOGG CCACSGGTCGA 840 
CCCGAGGQCC AGCGCAAGCA DCGOCAACAG GftGOSCCAGG AGW3AGGCTG GGAAQAACTG 900

GGAOGOGTGT CCCCAATGOS GGGCXK3CCAA, GCGTGGCRGC A0GG6CTGQG AGCGGGOftGC 960

CAGCGGQGTG C3GGGGCOC3GA GTGCG6GGAO aACCACCRQG CGC3CGGAATT GGGOAGCROG 1020

TC3GAGGGOGC AQCGGCTCCA GCCCCAGAJCC GCCGCGCTCT GTCACTTTQC ATTAAGAAAG 1080

CTTCXXSGGGA ATGCACACGG CCTGGCGGCC GOCTTOSTGC AGCCCGCCCT GC3kSGTGCM3 1140

(SAAGAAAAGA. ATAATCGCAC OOGTTTCTCA GGTGCTTACT TCACC&TQTC CXaATOCGAOQ 1200

TGTGACCAA6 ATAGCAAGGA 6CAGTCTTTA ASSSGQACACG GCAGAGAlGGC JU3AAAAAGAT 1260 

GGGCCTTACC GGTTAiTCTAA GAAAAAAAiGA GGAGClGTtG GCTGTCCCTC TAGCTITGAA 1320 

CTACaAAGTG GAGGQGAAGT TTGTCTGGAT TTTGCTCrTMS AAOTGJiGQGC ASGGACCTOG 1380 

AT3X3CTCGAG AACCTCCATA A 1401

Seq ID NO: C190 DNA Sequence

Nucleic Acid Accession #: XM_054869.2

Coding sequence: 26..2902

1 11 21 31 41 51

| | | | | |

TAGAOGCGQA GOCCAAQGaG GTAAAATGCR CACTTGCT6C CCOCCAGTAA CTTTGGAACA 60 

GGAOCTTCAC AGAAAAATGC ATAOCTGGAT GCTGCAGACT CTAGOGrTTG CTGTAACATC 120

TCTOGTCCTT T0C3TGTQCAG AAACCATCGA TTATTATGGQ GAAATCTGTG ACAAXQCATG 180

TCCITGTGAG GAAAAGGACG QCATTTTAJUi TGTGAGCICT GAAAAGGGGG GGATCATCAG 240 

TCTCTCTGAA ATTAGCCCTC CCCGTTTCCC AATCTACCAC CTCTTGTTGT CCaOAAACCr 300 

TTTGAACOGT CTCTATOCX3^ ATGAQTTTGT CRATTACACT GGGGCTTCAA TTTTGCATCT 360

AGOTAGCAAT BTTATCCftGG ACATTGAGAC CJGGQacrTTC CATGGGCTAC GGGGTTTGAG 420 

GAGATTGCAT CTAAACAATA ATAAACTGGA ACTTCTGOaft. GAXGATACCT TCCVCQQCIT 480

GGAQAACCTG GAGTACCTAC AGGTCSSATTR CAACTACATC JiGOaiCRTTe AftOOCR MGC 540 

TTTTGGGAAA CTGCATTTOT tTGCAGGTGCT TATCCTCAAT GftCAATCTTT TGTCCMXnrT 600

ACCCTkACAAT CTTTTCOGTT TTOTGCCCTT AACGCSMrTTQ QAlCCTCGGGG GGAACOGGCT 660

GAAACTTCTG CCCTACGIXSG GGCTCTTGCA GCftCATQGAT AAAGTTGTGQ AGCCACAGCT 720 

GGAGGAAAAC CSCTTGOAATT OTTCTTGTGA GCTGATCTCT CTAZIAGGATT GOTTGGACAG 780

CATCTOCTAT rrCAGCCCTGG TGGGGGATGT AtmTGTGAG ACOCCCTTCG GCTTACftCGG 840 

AAGGGACTTG GACQAGGTAT CCAAGCA06A ACTTTGCCCA AGGAGACTTA TTTCTQACTA 900

OGAGATGAGG CCGCAGAOGC CTTTGJVGCAC CAOGGGGTAT TTACACACCA CCCCGGGGTC 960 

A6TGAATTCT GTGGCCACTT CTTCCTCTGC TGTTTACRAA COCCCTTTGA AGCCCCCXRA 1020 

GGGGACTGOC CAACCCAACA AGOCCAGGOT GOGCCCCACC TCTCQGCAGC CCTCTA AGGA 1080

CTTGGGCTAC AGCAACTATG GCCXX3VGCAT OGCCIATCAG ACCftAATOCC CGQ3GGCITT 1140 

G^XMynaiCCC ACGGCGTGCT CTTGCaUVCCr QCAGATCTCT GATCTGGOCC TCAAGGTAAA 1200

CTGCCAGGAG CGRAAGATC3G AGAGCATCGC TGAACTGCAjG COCAAGOCCT ACAATCCCAA 1260 

OAAAATGTAT CTGACAGAGA ACTACATOiC TGTCGTGOGC AGGACAGACT TCCTGGAGGC 1320

CA0GGGGCT6 GACXmrCCTGC ACCIG(3GGAA TAAQOSCATC TOGAIGATCC AGGACOBOGC 1380

TTTCGGGGAT CTCAOCAACC TGAGGOGCCZ CCACCDSAAT GGCAACIHQGA irOSaG»GGCT 1440

GAGCCOGSAG TTATTCTAaXS GCCIGCAGftG CCTGC3M3TAT CTCTTOCTOC AGTACAATCT 1500 

CRTCC3GC3QA6 ATTCAGTCT6 GAACTTTTGA tXIOGGTCCCA AACCTCCftGC TGCTATTCTT 1560 

GAATAACAAC CTCCTGCRGQ CCAKSGCCrC AGGOGTCTTC TCTGGCTTGA CCCTGCTCAG 1620 

GCTAAACCTG AGGAGTAACC ACTTCaCCTC CTTGCC2U3T6 AGTGGai3TTT TGGACCA6CT 1680

GAAGTCACTC ATCCAAATCG ACCTGCATGA CAATOCTXOa GATTGXACCI GTOACATIQT 1740 

OOTCATtaARO CTGTOQGTGG AGCftOCTCM ASZGG60GTC CTftOrrGGACJG A6GTGATCT6 1800
TAAGGOGOCC AAAAAATTCQ CtGAGACOGA CATGOGOTOC ATTAAGTOSG AGCTGCTGTG 1860 
CCCTGACTAT TCAGATGTAG TAOTTTCCAC GCCCACACCC TCCTCTATCC AGGTCOCICC 1920 
QAGGAOCAOC GCCGTSACTC CTG0G6TC0G GTTGAATAGC ACGGGGGCGC OCGOaaGCXT 1980
GGGGGCftGGC GGAGGCSGGGT COTOGGrGGG CnGTCTOTG TTBATTCXCA GOCTCCTGCT 2040 
GGTTTTCATC ATGTCGGTCr TGGTGGCOOC aoGGCTCTTC GTGCTGGTCA TGAAGOSCAG 2100 
GAAGAAGAAC CAGRGCGACC ACMCAGCAC CAACAACTCC GACGTGAGCT CCTTTAACAT 2160 
GCftOXACAGC GTGTACGGCG OOGGCGGCGG CAOGGGOGGC CACCCACACG OGCACGTGCA 2220 
TCACOSOGQG CCX^GCSCTGC CChAGGTGAA GAGGCC3CS3CG G6GCAOGTGT ATGAATACAT 2280
CCCCCAOCCA CTGGGCCACA TOTQCAAAAA CCOGATCTAC CQCTCCCX3AG AGGGCMCTC 2340 
GGTAGAGGAT TAOUUUGACSC TGCACX32UQCT CAAGGTCAOC XACAGGAGCA ACCAO C A P CT 2400 
GCAGCASCAG CZ^GCAGCOGC OGCOGOCACC GCAGCAGCCA CAGCAGCAGC COCGGGOSCA 2460 
GCTGCAGCTG CAGCCCGGGG AGGAGGAGAO GGaOGAAAGC CACCACTTGC GQAGCCOCGC 2520 
erACAGCGTC AGCACCATCG Af3C3X3C3GGGA GGAGCTGCTG TCGCGGGTGC AGG&OQCQBA 2580
CCtaCTTTTAC AOGGGCATTT T»GAACCAG». CAAACAGIGC TOCACCACCC CGGCOGGCAA 2640 
TAGOCTCOCG GAATATCCCA AAT3XXXS3TG CAGaXOGCr GCTTACaCTT TCICCCCCAA 2700 
CTATGACCTG AGAOGOCCCX: ATCTQTATTT GCACOOGGOG QC3y3C3GGR)CA GCAGGCTACG 2760 
GGAAIxnOTG CTCTACfiGOC CCCOGAGTGC TOTCTTTGIA GAACCCAACC GQAACXaiATA 2820 
TCTGGAGTTA AAAGCAAAAC TAAACXSITGA GOOGGACTAC CTOG AftSTG C TGGAAAAACA 2880
GACCACtSTTX AGCGAGITCT AAAAGCAAAQ AAACTCTCTT GGAOCXTTTG OkKSXAAAAC 2940 
AAACAAGCIVA GOUiRCKCAC ACAGTGAACA CATTTQATTA ATTGT6ZTGT TTCRA09TTT 3000 
AGGGTGAAGT GCCTTGGCAC GGGATTTCTG AGCXTGGGTG GRAGATACSA AAAGGGT6T6 3060 
CftATTTOCTT TAAAATTTAC ACGTGGGAAA CATTTGTGTA AACTGGGCAC ATCACTTTCTP 3120 
CTTCTXGCGI GIGGGGCfUGG TGTGGAQAAG GGCTTTAA9G AGGOCAA3TT QCXQCBiQiSaG 3180
TGftCCTiGTGA AAGGTCACAG TCATTTTTGT AGTGSTTGGA AGTGCEAtUSA ATGGIGGATO 3240 
ATGGCAlSAGC ATABATTCTA CTCTTCCTCT TTTGCTTCCT CGCCCTGCCC OGOCKSCTGOC 3300 
CCAOCTCTCT TTCTCCCCTT TTAAGOCATQ GGTGGGTCTA ACTGGCTTTT GTOGAGAAAT 3360 
TAGCACAOOC CAACTTTAAT AGGAAATTTG TTCTCTmT OOGCOOCTCT CCITCICXCC 3420 
TGCCCTCCGC TCCCTTCTCA TTOCTTTTCT TTGTTTTTAA AGGAT6XGTT TGTATGCATT 3480
CXGGACATTT GAATTAAAAA AAAAOIAXTG TQATCCTGXA AAGGAriCMX: ATASAXGTGG 3540 
ACAAATCATT AAAATTACAG AGCTATATQA TCCATAATTB AnnSTCAAA ATAACTTATT 3600 
GATGAAATAT ACAAATAlTT TATTOTAGCA CCTATTTTTA TATGCAGATT TAGCAT^CCCT 3660 
CTTXCCTTCA CTATTTAGCC TATGATTXTQ CAQAGGTOTC ACACTGTATT AGGATCTGCA 3720 
TTTCTAAAAC TQACGTGGTA TCAGGAAGGC ATTTTCAATC ATTCAAAATG TGGAGAATTT 3780
AASGOCTAAA TCTXXAAAAG OCAATGGAAC CCACCCAATT GAKTCTGCAT TTTCTTTTAA 3840 
GAAAACaUSAG CTGATTCTAT CCCAAT6TAT TTTAAAAAAT AGQGOUVTTG A3C TOGGCCft T 3900
TCCOniGnGAA TTGTTTGCAA GTTITGGCTT TTATTAGAAA ATATTTGAAA OTATlllIAT 3960 
TRATGAACCA AAATGACATQ TTCATTTGAC TACTATTQTA GCCGATTTTC eATTOITTAA 4020 
CCAAAOOCAG TTGCATTTGT ACAQATCCSM: GTCTACTSGC AGC:TCftQAA0 AOOi&ATCftT 4080

GGACrOTACA AQTCTCTATA CAATGTCTTT ATCCSCTBTGG GCAGCAAGCA ATGATGATAA 4140 

TOACRAACAG GATATCTGTA AGATGOOaCX ACIGTrSTTA CSUSTCTCATA TGTATCCCAG 4200

CACATGTAAT TITTTAAATA GTTTCTGAAT AAACACTTGA TAACTATGTC 4250

Seq ID NO: C191 DNA Sequence

Nucleic Acid Accession #: NM_000793.2

Coding sequence: 401..1222

1 11 21 31 41 51

| | | | | |

accrracKQRG aijaggcactt tgcaccacrg acagatagca AOAAiOGeAAA gacagagagt 60

GAGAAAAAAG AGGAGTCAGT CGCTOCTGCa QftAGGQASAfi AGTGAGACTQ BGAiQAAAGAG 120 

AAGCACAOAA AQTGTGTQTA AAACGGAGTA AAGAAA6AAA AA7VAAAAAAC TACCCTTAAA 180

GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAQAAAGA AAGAG6CTAC GTTTAAAGAG 240

CATAOAGACA ATQAAAGGCT AAAGAAAATT TTAAAATCTC TQCCACftGTC TCATAGC3TQC 300 

TTGGAAATGA AAGTAGAACT GOCTGTCTTT AACGGACTCT GACAGAGAGO GTGAAGGGGA 360 

ACCAQAGCGC ACAA(SGQAAC TOACTCftGGA GGCAGAGAAQ ATC3GGCATCC TCAGOGTAOA 420 

CTTGCTGATC ACACTGCAAA TTCTOCO^QT TTTTTTCTCC AACT<3CCTC!T TCCTOGCTCT 480 

CTATGACTCG QTCATTCTQC TCAMXAO&T GGTGCTGCTG TTGAGC3CGCT CCAAOTOCAC 540

TGGCX3GAGAG TG6CG60GCA TGCTGAOCTC AQAOGGACTG GGCTGOGTCT GGAAGAGCTT 600 

CCTCXZTOQAT GCCTACAAAC AGGTG2AArT QGGTGAQGAT GCCCCCAATT CCAOTGTOCST 660 

GCATGTCTCC AGTACACSAAa GAGGTOACAA CAGTOGCAAT GOTACCCAGG 2V3AAGATAGC 720 

TGAGGGAiGCC ACAlGCCACC TTCTTGACTT TCCC3«3CCCl? GAGCGCCCAC TAOTaorCaiA 780 

CTTTGGCTCA GGCACTTGAC CTCCTTTCAC GAGOCAGCXG CCAGCCTTCX: GCAAACTGGT 840

OGAAGAGTTC TGCTCAGT66 CKSUrTTOCT GCTGBTCCAC ATTGATOAGG CTCATCCATC 900 

AGATGGCTGQ GCGATAOCGG GGGACTCCTC TTTGTCTITT GAGGTGAAGA AGCACCAGAA 960 

CCAGGAAGAT OGATGTGCAG CAGCCOWSCA GCTTCTGGAG CQTTTCTGCT TGCCGCCCCA 1020

GTGCGSR£5TT GTGGCTGACC GCATGGACAA TAACQCXZAAC ATAGCTTACG aOOTAGCCTT 1080

TGAAOGTGTG TGCATTGTGC AOAOACAGAA AATTGCTTAT CTGGGAGGAA AGGGCCOCXT 1140

CtCCCRCMVC CITCAAGAAG TCOGGCATTG GCIGGAGAAG AATTTCAGCA AGASAtGAAA 1200 

GAAAACTAGA TTAGCTGGTT AAAGSEA.TGA TTATAAGAGA GCTTArriGTT TTAAAAAGTT 1260

ATATAAAGGC AAGGAAATTA AGAACTGAAT CCRTATTTCA ACAGAGCCCT ATTOaCTTAC 1320 

TGAAAGACAG GAGTTTATCT ATCOGAAOAA CATGAATCTC TAACRGCTOC ATACTTCTTT 1380

CACtAiC3<»A AtGGCAXTGS GCTGAGTAAG TAACCATATC ACCTCTCTTC TTAGTAAAAA 1440

GC0CTATC3TG AAAAGATCCC A/^TGGRGA GGAAGAAAC3S CTAASTCAGC ATGTSTTCAT 1500

TCPGCftartGA GAAGGAACTG ATACATCTOA TGCATEGCTTT GAGACCAGAA GAAAAEACTT 1560 

ACCTQAATAA TTACTACATT AQQGiiAGCrA CTGTCTACGT TAAGATAAAG GGTATTGCCT 1620 

TOGCTCTATT TGGCATGGAT GGAGCCCAGT TaORRAATTC CCAAAXATTA CAACAAGTCX: 1680 

TTGAACCCAG GCCATGTOGT TAGAOSTTOG TGTTAAK3?rT AGAiCCrTATG TTAGAiGTCAT 1740

XTCTGAISIT CC2U5CTTCTA GCCATOTW3T OCTCTCAGTC T TCATA COCC ASAAATTATI 1800

GGTATATTTG TAQATACCOA GAATGATGOC TCAGTCTGAO ASGTRtAGAAT GKTCATCTGT 1860

AATCTGAGGG TTAATTTCTA GGCAGOTGGA GAGAGTGGTA AAAAAGAAAT GAAATTGACA 1920 

AGCrAGOAAA GAGGAGGCRfi AAAGATDTQG AAAATTCACA GAGTrTCAOC CTTAAGCXQT 1980

AGnSAGTGGG 'XCACATTTGI TA60CA0GGA AACTTAGAAA CATACACAAG GCCAGAAAAA 2040

GAAGAAGGAG CTOUICTAAA AOTGOCATAG ABAKIACACA TATAAAAAGA ATATKCTI<GT 2100

CATATGCTCG TAGAGAGGAG AAAGQGGTQA ITGAAAGAAA AAAAAATACT TAAATATITG 2160 

TAATTOTQAG GOOTTTCTTT TGOAAAIAAT TACTTTTOAA OCATGTATGT GGTATCTATA 2220 

TTTTCRGTGG GTTAATTATA CCCCATQATA CCIATTAT^ GRARACCAGT GGGTCIOSTG 2280 

QracTsarcr tttcctcccc attcctacaa tttctatotq qexicaagtca ttcctaatct 2340

TGGTCrCTAT AGCAGTCTTC TCTCTGAAIG CTGAOCIGAA GAAATTATAC GTAdUtACAC 2400 

ACKEACAiTAC AlftC3VlACAA ATATAT6TAT ATATATTCIC AGCIGCTGCG GGAGGTAGOT 2460 

ACCATOGCCA TTCAGCACAG CCTTQATTTC CTCCXMAGT AQOTGAGCTA lAGTGAAGAA 2520 

TAGGTGCRRA CAAACAAGCT TACTTCCATT GCftZUiATAGA AGAAGAGGAA GTTAOaGATA 2580 

ATTCTQATCA ATCATTTTG6 AGGCITtCGTI ATAAGGCAAC CC3C00GXATA TCA JGGAA TT 2640

TOCATTGACA TTTGAATITS GACTTGGATC TTCOCTTGGT COCATTAGCT GftCSGTfTAGT 2700 

AATCTAAAOT OCXrEATASTA TATGATTATA ATGCIATTTT AAAAAATA!CA TATATAAAAT 2760 

ATTTTTTTCr TTTTAAAATA 6ACACTAT2W3 TTTTACCCAT AAGTAATATT TAAAGATTAT 2820 

AGCrCCCSiAA AfiAATQGACC AACCACTTTC GTATCATAAT TTCTTTTTGG TAAATATGAQ 2880

ACTATTATGA AATCATAGTA TATGATTGTA TTTAAAGGTA CAATCAAAOG ATCITTIGTC 2940

GATTCCATTA AIAACIGAAT AAAAAATAAA TAAAATGGAT AGAAAAAAAC TAAAlSTTGAA 3000

AATACATTCT rCAAACTAGTT GTCTGAAATG AGAAAAGAGT QAGAACTAGG TGTGCAAGAA 3060 

OCAAACGTAT TTTATTTTAT TTTTTAAATG GGAGC^^ACAT ATCAOTCOTG *FCACCRGCfSG 3120 
QTATATTGTG TAAATATTAA AGCTCCATTG GGACTOATTT T^TCATQGCAA CATCROCTTT 3180
CrAAIGTTCC AAATTCXA3A AAAACCACGC AiCAAAGAAAC AAAGC3MIT TCAT1ATCTA 3240
ATOAGTTGCr GGAAAAXCAT ATTGA5AATA AT7ATTTCAG ATTOCTCAGT TGTTAACTTC 3300 
TACATTC3^ GGCTTATCTC "XXSCCCCCKTS GATTTTTAAC CICAAAAXGG TGTGAGATTT 3360 
ACT6TGGAAC CCTAAAGCA6 TAAAATAAAA AACXTGGTTG CAGCACATTC ACACTOTTOT 3420 
OCXTAAAATT CCCCXTTTTT CTCTATQTAC GATAAJVGTAA CAOTATGTCA GATAAGCCX3G 3480
TGGGGGGATG AGATTAGGCT GAGGCAGTGC TAGTCAAjCTCG GGGGAAAAGS ATGATGOAAA 3540
AATCACCCAIS TCGTGCIATA TTTTTAAAfGA AGGAGGTGGT TTATGTGTGC AGACftATTCT 3600 
OCCTGAG6TT AGCCCAKTGS ACSAAATGAAG CAGAiSaAAGQ AAACATAGAA AOACA^GGGC 3660 
TATCAGGQAG GAAGATGrXC AATAQAACAT GCAAGAATTT CTGGAAGA2^ OGCTGIGGAA 3720 
GGGCCAATQO AGAAAATGAA TGGACAAAGC TCAGGRATCC CTACGCTATG TAGAATGTTC 3780
TTGGTGTTAT CAGGGTTAAG CCCXGTAAXT AtEGTAACCTA TTTATCGGAA CATGAATITT 3840
TATGAITTCT IGTGATGTAT TCTTTTATOA AATTAACAAQ AACICATTAT TTT3AGaTAO 3900 
AGGRAAAXCA ATGCTTTATC TGATATGCTG AGAAATTATT AGftTieOCAA XACTCATGTG 3960 
OGTTTCATGT GTTTTATAAG QTTTGTTCCT TTQAAGRAXT GTAGTTCTXA GTCCCACAGS 4020 
GAAATffTGTA TCTATTTATA TATCATftfiTA TAAATCTATG ATAIATTTAT AXCATATATA 4080 
AAAaTCi:6AG TTCTCTTTCT TAOTtXXZTAA TGATGTTTCI CCCATAGGCT GTGTTTACAT 4140
GGAGCIATGG GTTTAGCCIT TTAAGCTTCA TTAGCITCTC 13VITATTGAA ATAQTTTCCA 4200 
AG%AATTXTA GATATTATCA TAAChTCTGG GTCTACTCAA ACS^CTTATTG T7TGAAAGAC 4260 
TTATGTCrXG GACCSATCAA AAACXGAlCTT TATTTATTGC TlAGlGAAAA TACTABTOaS 4320 
ATCAACAA!FG ATTTTCTTGA ATGGOCATGA ATGGAGAXGC COGCACAGTA ATGTAGAAAT 4380 



GTTTCATACA 
ACTTTGTATA 
CTCTGGICCX 

cttcdcagtg 

CATCACAGTG 
GGGCTCTGAA 
ATAAAATTCT 
A66QTGGGAT 
AAGftATTGAC 
OSACATTTGT 
TATC3ACCRTT 
AAAACCTTTA 
TGGAGACTGC 
TAAATGGTAT 
TCACTCTAAA 
TACCAATCTC 
AGAGAGGAIC 

GAAOGCTGCT 
ATGCCCTATG 
CAAGCfTCAGT 
TGCAAGTCGG 
CAATATTTAC 
AAAIXCTCCA 
AACCCACAGA 
CACTGGTC2«3 
AQTAJCTTTAT 
TTAGTTTTGC 
GAAOAGGOAQ 
AAGAXSTCATC 
TTTCTCTCCC 
GTC3AGAATGC 
GCTTTaaOAG 
TTATAG3UVIT 
AA2UUUIAAA 



GCTATTAAAA 
GCTAAjCtTQAC 
GTGTCTTCaC 
ATCTGrrCAQ 
OBAAGAATA6 
ATAAAAATTA 
AGGAGGGCAG 
TACAAGGGTG 
TGATTTACftG 
TCCaCCOGAC 
QAAAAAOGAA 
CTAGCATTTA 
AAGTAAGGCT 
GGCCAAAAST 
AATAGAATCC 
TGCTTTGCftA 
TftOGATGGGA 
ACCAATCTGG 
GCAAGTCATT 
TGTA*rAGrAC 
CTQAACCAAT 
GGCTGGCTAC 
TOAQACTTOS 



AGGGGATGoa 
CQTGGTTTTT 
AACCAAAGCA 
CATCACATTG 
ACRGAGATGT 
GACCrCAQTT 
TGiBCAAGGftG 
AOGCCTGCAT 
A1TATCAGTA 



TOTAACTGAC 
AGTCACTTAA 
CrCATTTATA 
TTAAGTTCTT 
OCrATTGTCT 
TOAAATATGG 
GTTAtSQAGAC 
TTCCTCAfiGC 
GACTTCTCTT 
CTCTGACTGA 
l^AATGTAGAC 
GAGCTTTTCA 
TTTAAITTTA 
C&GAGTTAAA 
AAACCCACTC 
TQOGCACAAT 
6AGCTAGAAA 
GAAGATTTGA 
GAGTGACTTT 
CAGAAGCAAG 
AGAAAGCAAA 
CCGTCZTAQQ 
AAGACOCAGC 
TCCCCTCRAA 
AAATAAAGAA 
ATGTGTATTA 
ATTAAATi3AT 
TCAOOCAGAC 
GCCAGAGTTG 
AGTGGTTGGA 
AATATGC6GG 
ATGCTACACA 
GAAAGAQTGI 
TTTTAATAAA 



CTCCTTAGAG 
CTTACATOAC 
GC3\CGTCTCC 
CrCCCGTTAA 
rTC!ATTTTGC 
TGAGQTCACA 
AGTnCATGTAT 
ATGCCCCTAT 
TATQTCAATC 
TGGTTTGGAA 
TCTQACTTCC 
GAACATCCCC 
GOAGGTTTTT 
ATATATATAG 
TTCATATATQ 
CTTGGTCATG 
GTtTGCTAACT 
AAACAAACTT 
AGGATGAGCA 
GTCTCftGACT 
CATGTGCAQA 
CAGCAAC/tfX: 
AGATGTrrAA 
AAGCCAACAG 
AATTCTCTCA 
GQATTGaaOG 
ATTGGGGTAG 
CTCACCTAGC 
ACXTCAGTOTG 
TGTAGTCACA 
ACATQATOCT 
TATGTGCTTC 
TATCATATTG 
CTTTGAATAA 



6CAGATTRGT 

TTGATTTTlG 
CCAOQAAGTG 
CIGAGTGTAT 
TGTTQGTGCT 
GGCCTXTCGG 
GQGCCCTATG 
TTAAOAGGAT 
AATAACXTTA 
GTCCX31CTGA 
ACTGTCATGT 



TTAGATTCGA 
CTTOCAGAAT 
TCCTGAGGCr 
GGGAAGAACA 
CTCGCAACTG 
AAACATTOGG 
TAAC2U3ACCC 
TATCCAAACA 
AGAGGTCCAG 
TGAAGTCACT 
6TAAACACAT 
AGACXTCTCC 
ATGTtBU^GAA 
GGAATGTTGG 
COCAAQXAAT 
OGGATGATAA 
TrAGTTTGCC 

aagaggcx:ts 

TCAGTTGCAG 
OTQCIGAGTG 
AAGAATAAAA 



AACTGTTCCr 
CACATTGGGT 
arAGTATCAA 
CTTATTCTCT 
TTTACTATTT 
GCCITGCTGC 
QAAAATTCAA 
TGGAJVGCAAG 
GGATGAATCT 
ATTAQOATCA 
AGGATTAATG 
GTCTCAGCAG 



ACTTCCTCCT 
GGGGCTTAAG 

AGGCCCTGAG 
AAG6AAC3GCT 
CXIACTTCCTA 

AGACTGCICA 
GGAGCTTATT 
ATTTTGGCTPC 
AAATGAAAGA 
AGGCCCATGT 
ATAAGTATGC 
CCAGTTTTQT 



CTACTGAC6A 
TCTCCCCATC 
G6TAAATGT6 
AAAATOAACr 
CTAT6T6TGC 



Seq ID NO: C192 DNA Sequence
Nucleic Acid Accession #: NM_006549.2
Coding sequence: 824..2590



1 
I 

G2^cciCGaa 

CRACACAGAG 
AAAAAAAAAT 
AOCCCAAGGC 
AACCCCATCr 
CCAQCTACTC 
GATCGGAGAT 
AASAAAAAAA 
TGTGAfSTffrC 
ACCCAQTTAA 
CTOGATTCCC 
GGTAGGACOG 
TTATGCAGCr 
ATGG6AGT6C! 
TAGCCAGOCC 
CAGCAGCGAA 
OCrQQGCATG 
CGGCTTGGGG 
OGGGTCCCAQ 
QGGTGGGCTG 
GCCCTACTCS^ 
I3G»GTCTGI^C 
OCrTGAAGGAT 
TGACAATAC3C 

crrxccAOGT 

CAGGGGGGCC 
CAAISTGGTG 
CTTCGRACTG 
AGACC3«3QCC 
GAAGATCATC 
CAAGATCX3CT 
CACC3QTGG0C 
CTCIGGGAAG 
OCAnTGOCCA 
CCTGGAATTT 
GCTOOACAAG 
CAOSAGGCAT 
GACTGAAGAG 
GGTGAAGACC 
GGAACGCTCA 
GTCCCXOTCT 
OOCCGQTGGG 



11 
I 

AGGTOSAGOG 
AGACOCTGTC 
GGGAGTGGGC 
GGGTGGATCA 
CTA06AAAAA 
GGGA6ACTOA 



CCTRCGGCCT 
CTAGAGGGCT 
XXTTGaCTQT 
TTCM3GGA6C 
GGOSGATdCA 
COCAGTGTGC 
AGCAGCAACC 
AGCCAGAAGC 
GAGTCCnCA 
C3GGGACCQGC 
GOOOGGCCXX: 
GCAGGCGGTG 
CCOGTCAQCT 
CACSGrTCIOCA 
GAAATTGGAA 
TACTATGCAA 
CGCCCTOCAC 
ATTGAGGAGG 
AAOCTGGTGO 
GTCM^CAAO 
CC3TTTCTACT 
CAOOGTGAGA 
GACTTTGGTG 
AGGGOOGCCI 
GCCTTGGATQ 
TTCATGGAOG 
CX3U3ACCAGC 
AACCCOGAGT 



21 
1 

TGCAGCGAGC 
TCAAAACAAA 
CXSGGCGOOOT 
CGAGGTCRGG 
TACAAAAA7T 
OGCAGAGAAC 
CACTCCAGCG 
AAAAAGGGAG 
AGAAATAOCA 
TTQAACCTTT 
T70GCAG66T 
CCTCCAGGTC 
ACGCACACIT 
TGGATOAAGC 
GGGCCGCCOC 
CCTGKSAGGG 

COCTGGAGGG 
ACCTCrCGQG 
QCAGOCTGGA 
CCCOSCAGTC 
TOWnSGGTAT 
AGGGCTOCTA 
IGAAGGTGCT 
OCXX3AGGCAC 



31 

i 

CGTOATOt^TG 
CAAACAAACA 
GACTCACACC 
AATTCAAGAT 
AGCCTAAQXAT 
TGCTTGAACC 

TOQGGGTGGA 
GAGAAGCACA 
TATTAACTTG 
CAGIQAGACA 
TTCDTTTCrC 
AAGGGTCCAG 
TGGOQCATGC 

ccaggatgag 
cctgcguggc 

OBAGTGTGAG 
CGHtGGCCPA 
TGQCAAGCTG 
CATGAAOGGA 
CTCGCCTCGG 
GCAGGhCTGT 
IGGIGTOGTG! 
GTCCAAAAAG 
C03GCCAGCT 
AATTGCCATC 



41 
I 

CTACTGCACT 
AACAAACAAA 
TGTAATCOCA 
TAGCCTGGAC 
GGTGGCCGGC 
TGGGAGGCAG 
AGCGAOACTC 
GCICTCATTG 
TOQGAACGGG 
GAGGTTGACT 
^rGCCCZGGOT 
CTCTTCOCOG 
CSAAAGIGGCT 
ACCATCTCAT 
CXGGGGGGCA 

OCOGGCIGTG 
GAflGTOCCCC 
TCrCTGCAAG 
OSCIGCATCT 

6T0CA6CIGA. 
AAiSTTQGCCT 
AAGCTGA1CC 
CCTGOBOaCT 
CTCAAGAAGC 



GAGGTCGAGA 
ATGATAGGTA 
CTGTCAOCQC 
GAGCTCAAGG 



AGSTOCIG S V 
<3GCCOaTGAT 
TCCAGGAKn: 

TGAGCAAIGA 
XCATOGCIUX: 
TTTGGGOCTUT 

agogqatcat 
ccg;vcatagc 

OGAOGATCGT 
OGTTSOOGTC 
ACTCAGTCAA 
AAiGGCrrCCTT 
CTGQAAACTT 
AAGCAAGGCA 
GXGCCCTTGT 



OGAAGTGOOC 
GAXCAAAGQC 
CAACCTCCTG 
A'naUkQQGC 
GGAGTCGCTC 
GGGIGTQACA 
GTGTTTACJW: 
TGAGGACTTG 
GGTGCXGGAA 
GGAGGATG2U3 

acacattccc 

i:gggaaocca 

GCrrCACCAAA 

gcsgaagacaa 
grgaggcagt 



AOCCTCAAAC 
ATCQAGTACT 
GTC3GGAGAAG 
AGTGAOGOGC 
TCTG3USAGCC 
CTATACTGCr 
AGTAAGATCA 
AAOQACCTGA 
ATCAAGCIGC 
AACIGCAOGC 
AGCITQGCAA 
TTGGAGGGCA 
AAACCAAGCA 
CCrCC3M3GGC 
GCCrGCX3l:GQ 



51 
I 

ocagcx:tggg 
caaaacsaag 

GCACTTKJGe 
AACATGQTGA 
GCCTGTAATC 
AGGTTGCAGT 
OGTTTCAGAA 
GCTOGrTTQCA. 
CTGGAAATOC 
CTCCTGTCAA 
CBCICQACCC 
CACAGTGCT6 
COGCTGCOGG 
CATGTGTCTC 
OGQGCnSCAS 

tgagcatcca 
ctgtqoacjct 

TTGACTCCTC 
AGOGGTCCCA 
QGCCQTCOCr 
GGO06ACAGI 
ATCAGSKIAC 
ACAATGAAAA 
GGCAGGCCOG 
GCATCCAGCC 
TGGAiCCACCC 
TGTACRTGGT 
C!ACTCTCTQA 
TACACTADCA 
ATGGGCACAT 
TGdCTCCAA 
GCAAGATCTT 
TTGTCTTTGG 
AGAGTCAGGC 
TCACOOGTAT 
ACCCCEGGGT 
TaGTOGOAaF 
CCGTGATOCT 
GCOGGCGGGA 
GGGAATGTGA 
AOCGACCQ3C 
AAAGTTGCTG 









GGOCCCOaCC OCOOGCTCCC CCGCACGCAT GCATCCACTG CGGCGGGUUGG AGQCCRTGGA 2580

GCO062V5TAG CTGCClGG&T CGCTCQACCT GGCA.TGOQOG COGOGTOSCC TCTGGOGGGC 2640 

TGCTOCMTCG CGTTTCCATA GCRGCATGTC CTAOGQAAac CCAfiCACGTG T6TAQAGCCT 2700 

CGATCGTCAT CTCTGGTTAT TTCTTTXXTC CTTTGTTGTT TTAAAGGOQA CAAAAAAAAA 2760 

AAAAAAAGGA CTTGACTCCA. TGAOQTCQAC CXSTGOCCSGCT OeCTGGCTGG ACAOGCGGGT 2820

GTQIViGGhOTT GCACACOCAA. AOCCACGTGC ATTTTGGGAA ATEGCTTTTT AAAASATTTT 2880

TATGOCAAAA ATCCTTCATT GTGATTTTCA GAACCACGTC AQAIATAOCA AGTGUUnCTG 2940 

TGTGGGGTTT OACAACTCGTQ QAAAQGCGJM5 CAGAAAACTC CGGGGGTCTG AGGCCATOGA 3000 

GGTGOTTGCT GCATTTGAGA GGGAGTAGC3G OGCTAGATGT CGCTCCTAGT GCAAAC3CGGA 3060 

AACCATGGCA CCTTCCAQAQ CCQTGGTCTC AAGGAOTCAG AGCAGGGCTG QCCCTCAGTA 3120 

GCTGCAGGGA GCTTTGAT6C AACTTATT7Q TAAGAAGGAT TTTTAAATTI TTTATGGGTA 3180 

GAATTG7AIGT' CAGOAAAACA aAAAGX^GCTT GAAATTTAAT AAGTGCTGCT GGAACSGOGAT 3240 

TTTCCftAGCC TGGAAGGGTA TTCRGC3VGCT GTGGTGGGGA AACATTTCrrC CT6AAAGACT 3300 

GAAOGTGTTT CTTCATQACA GCXGCTCAAA GCAGQTTTCT GAGATAGCTG ACCXSAGCTCT 3360

GGTAAATCTC TTTGTCAAAT TACGAAAACT TCAQGGTGAA ATCCTATGCT TTCATGTACA 3420 

TTACATGQCr TAAOATTAAA CRAAAACATT TTTCAAQTCT CIAACTAGAS TBAACTCTAG 3480 

ACCACftGTAG TTCftGAAACT ATTTAOnSCT TCCAGGATAT ATTTCACAlGC TTCAGGCATG 3540 

TGATCAaTTA OAQCCOATGA AACCIATGCC CGC3CTGTATA TATATTAGCA GCTTftGCTiUS 3600 

TXCATAACCT GTATATTCTA AA(3ACTGCTA AGGTTTTGTT TTCRTmAA ATCCTAGCTQ 3660 

ATTGTTGTCO TCAATGAAAT AOCCAGTTTC TGAAGGQCCA GG^EGGGJiAAT GCTTTCACTG 3720 

GACCAACACA CAAATGATCA TCCTGAOOAT CZGAGCXTCC CTAQACTGCA. CACAATAACC 3780 

TTGGGGCACC CTTTTAGAdSA AGACTGTTGA AACCCA£^C ACTOGTTGGQ OTATGZUSGAA 3840 

ACCAGGGCTT GGCACAGOAA QTTCCCCTTT GTAGCTAAfiA GTCCAGAAAG AAAGGQTTCA 3900 

TCTTTTTGAC TlCCAACTGA TATTGGGAAG TTTQGTTGAG GTTCAAtJXOT t3ACTCCTTCC 3960 

AGAGOCACAO OTAOQGOAGT (ntShAGTTGA GOGGGAOGAA AGCTGGAAGG ACTCTGCCTT 4020 

OGQIU^TICXCC CnSCTCIGCT TTQCaOGaCl TGGTGGAATC TGGQCIOQGG AAAGACGGCA 4080

COQGGAAACT CIGCTTCCCC ATT6TTTOCA TCTGATCAQC ICTGGTGTQA GGACTTCICA 4140 

GACAAAGGCA ACGCCTOOTQ CCCCTGCCCA GCCCATTCAT aGAGCGCTGG GCCTTCTTGa 4200 

CTTCCATAiSA TCCTAAGCTC TTGACTGTAG TTIASCCA6A CTTGTTTTGC TATCTTATAA 4260 

GC2^GITCAGA ATTAQGOAAT GCTGGTTTTG AAGAGCAAAG GACAGGIAGT CTAOAGAGGG 4320 

TOSTCTGGCX: TGCTT6CTG6 OTCTTTGTAA CCCAGCACTT OCTCTTeaCSOC TCCTGGCTTT 4380

ATGTTTATGG GAlBniGGACrC AATAGCTOCA CCCCrTTCTGG CAGCAGATGO OTCTTGGTTA 4440 

OTTTOCAATA AGCACCTTGC AGAC3GTTAAA GCCafiC5GQGT CCCTftOTCTT ASGCCCAGOC 4500

TGCTTGTGTG GQCTOTGGCC TGGCCTGGTG GCTGGOCCAO GGGGCRGC3«J TGCTTA13AGG 4560 

TTCTGCAGG6 CTTCTCTTGT TTACZACAOCT GCATCAGACA ATOCCaTTTC TCCCCACCAC 4620 

GGPOVCCJnXSC ATCTAA6ATT TCITGCAGGG AATOCXAflCA AXCAGGCAOC ACCCAQCZGX 4680 

GGGGGC2UST6 GGGTGGGQQA GACCCACSUET GATGftCTTTT TTTTTTTCTT TTAAT6AA6A 4740 

AACACCAAAS AAAGCTGTGG AAAGGACCTG CCX3CACATGA AAAGGATAAG CC2AAGA1GGC 4800

TGTAAACACA GAGCATTTtSA GCTQOCZAiCTC tTTGGflGCACA TTOATTTTTC AARftGCCAGC 4860

TCTOTCAGQA AAGGAGGIXSC IXvTTATGACA GCTCTTCCAG TGGGCAAAOA GOACGCOCAT 4920 

AaarxTCTTGC ATTGCTAOCT CATCTQTOGS ACCAATTIGO TGTAatOCAAC CTGTGGOCTG 4980 

csAcrrGToac ctcdgraggiaa gcacaaaccc tocatccact toccatttcc tctgcccttt 5040 

TOCACCrOCC CCTTCCATCC CSUX3U3CK3C CAGTGGCTCC CAOAAASCCT TATTGAGCCC 5100 

CTTOTTOACA CITGGGGCTG OGGAGGOCTT TCCCTACTGG TCTGGCCTIT C3CnGAGAGGC 5160 

AGGTCITCOG TOCTCAGAGC CTTTCTGGAA CAAOGAGAAT GCCTGTGCAG GTGGACACAC 5220

AGGCcroacc TGraacrcrc acixgxcttc CAGCGGoaaG cttgaogttg cooagtsgaa 5280 

GAACXATGAC CTCCACTT6C TTOCAASQTG CTAGGGAnGT TTC3U3GGIAC 6CTGGTXCCC 5340 

CTCTCCAOCT QOAGGOCJGAG TTTCTGGGGA CTGC3U3ATTT TTCTACTCPG TGATCaSATTC 5400 

AATGCCCGAT GCTTCTGTTT CATTCCCGAC OCTTT C TACT ATQCAlTTTC CTTTTATCAQ 5460 

GTOTATAZUWG TEAAATACTG TGTATTTATC ACTAAAAAGT ACATGAACIT AAGAGACAAC 5520 

TAAGOCTTTC OTOTTITTCC ACAGGXGTTT AAGCTTCTCT GTACftGTTGA AATAAACAiQA 5580 

CnSCAAAAarG GIAAAAAAAA AAAAAAAAAA a 5^11 



Seq ID NO: C193 DNA Sequence
Nucleic Acid Accession #: NM_0ia646
Coding sequence: 317..2394



1 11 21 31 41 51

| | | | | |

QCXCTGCCAA QTGTAACAAA CICACAGOCC TCTCC^UUUZT GGCTGGGGCT GtTCQGGAGAC 60 

TCXXaUUSQftA CraGTCaGGA AOGCAGGAGA CAGGAdGACGQ OAGCTCIACA GOSAGAOSGT 120 

GOaCGGlSOCC VXGGGGGG6C TGATGTGGGC OWkGBCTGA GTCCOGTCAa GOTCTGGCCT 180 

OGGCCTCAGG OCCCJCAAGOA GCCQQCCCrA CACCCCATGQ QTTTGTCRCT QCXCAAQQAQ 240 

AAAGGGCTAA TTCTCTGCCT ATGGAGCAAG TTCTGCAGAT GOTTOCAQRS ACGGGAGTCC 30O 

TGQGCCCSkGA GCCGAGATGA GCftGAACCTG CTGCAGCAGA AGAOGATCTO GGAOICZCCT 360 

CIOCTTCTAG CTGCCAAAGA TAKTOAIOIC CSkGGOOCTGA ACAAGTTGCI' CAAGTATGAG 420 

CATTGCAAGG TQCSkCCAGAG A06AGO3KXG GTOGAAAlCAlS GSCTACACAT MCAGCCCTC 480 

TATGACAAC5C TGGAGGCOGC CATGTOSCTQ AIGSAGGCIG CCCCGGAOCT GGTCTTTGAG 540 

CCCATGACAT CTOAGCTCTA TGAOOGTCAG ACTOCACTQC ACATOGCTGT TGTGAACCaifi 600 

AACATGAACC TGGIGGGAGC CCTGCTTGC3C OQC3U3GGGCA GTGTCTCTGC CAGAGOCACA 660 

GGCACTQGCI TCCGCXSGTAG TCOCTGCAAC CTCATCIACT TTGGGGAGCA OOCTTOTTCC 720 

TTTGGTGCCT GTGfGAACAB TGMBSAOATC GrQCSGCTGC TCATTQAGCA TGGAGCTGAC 780 

ATCCSGOGCCC AGGacrC3C3CT GGGfiAACftCA GTGTTACACA TCCTCATCCT CCftGOCCSUU:: S40 

AAAAOCTTTG CCTGCCAGAT GTACSftACCTB TTGCTGTCCT ACGACAQACA rrGGGGACCAC 900 

CTGCAQCCCC TGGACCTCQT GCCCAATCAC CAGGGTCTCA CCCCTTTCAA GCTGGCTQGA 960 

GTGGAGGGTA ACACTSIGAT OTTTCAGCAC CTGATGCAGA AGOGQAAOCA CACCCAGTGG 1020 

ACG^TGGAC CACTQAlCCTC GACTCTCTAT GACCTCACA6 AGATGGACTC CTCAGGGGAT 1060 

GAGCAGTCCC TCCIGGAACT TATCATCAOC AOCAAGAA6C GOGAGGClGG CCAGATOCTG 1140 

OACCAGACGC CQOTOAAGGA GCTGGTGAGC CTCAAGTGGA AGCGGTACZia GCQGCCGTAC 1200 

TTCTGCATGC TGQGTSCCAT ATATCTOCTG TACATCATCT GCTTCACCAT GTGCTGCATC 1260 

TACCGCCCCC TCAAGCCX:AG GACCAATAAC CGCACGAGOC CCOC5C3QAJCAA CAOOCTCTTA 1320 

CftGCAGAAGC TACTTCAQOA AGCCTACATG ACCCCTAAGG ACXSAXATCOG GCTGQTCOQa 13 BO 

GAGCTGGTGA CIGTCATTGG GGCTATCATC AT0CTGC1X3G TACAGOTTCC AGACATCTTC 1440 

A6AATGG6GG TCACICGCTT CTTTOGACAG ACCATCXSfTG GGGGOCCATT CCfO&ICXXC 1500 


ATCharCAGCT ATOCCETCftT GSTOCTGTOS ACCATaGTOA TGOGGCTCftT CAGTGGCAGC 1560 

TACQCATGTC CTTTSGA.CTC GrGCTGOGCT GGTGCAAOGT CATGTACTTC 1€20 

t3CCOG?W3GAT TCCAGATGCT W3GCCCCTTC ACCATCATGA. TTCAGAftGRT GATTTTTGGC 1690 

GA-CCTQATGC GATTCTGCTG QCTQArGGCT GTGGICATCC TGGGCTTTGC TTCRSOCTTC 1740 

TATATCATCT TCCAGACAOA GGACCCCGAG GAGCTAGOCC ACTTCTACGA CTACCCCATO 1800

GOCCTGTTCA GCACCTTCX3A GCTOTXCCTT ACCATCAT06 A!EG6CCCftGC CMCTACAAC 186 D 

GTGGACCTGC CCTTCATGTA CAGCATCACC TATGCTOCCT TTGCCATCAT CGCCACACTG 1920 

CTCATGCTCR ACXTTCCTCAT TGCCATQATG GGCOACACTC ACTGGCGA6T GGCCCATOAG 19 BO 

CGQGATGW3C TGTGGAGGGC CCAGATTGTQ GOCACCACGG TGATGCTGGA GCGGAAGCTG 2040 

CCTC3GCTGCC TGTGGCCTCG CTCCOGOATC TGOGGAOGOQ AGTATOGCCT GQGRGACCGC 2100

TG6TTCCTGC OOG'IGGftAGA CAOGCAASAT CTCAACOGGC AGGGGATCCA AOOCTAOSCA 2160 

CAGQOCTTOC ACAQCOGCSGG CTCTOAGGAT TTGGACAAAO ACTCAGTGGA AAAACTA6AG 2220 

CTGGGCrGTC CCrTOWaOCC CCACCTGTCC CTTCCTATOC CCTCAGTGTC TOaRAGTACC 2280 

TCCOSCAGCA OTQCCAATTG QGAAAGGCTT CGGCAAGGOA CCETQftGGAG AGACCTGCGT 2340 

GGGATAATCA ACASGGGTCT GGAGGACGGQ GAGAGCTGGG AATATCAGAT CTQACTGCGT 2400

GTTCTCACTT CGCTTCCTOQ AACTTGCTCT GATTTTCCTG GGTGGATCAA ACAAAACAAA 2460 

AAiOCAAACAC CCAGAGGTCT CATCTCCCAQ GCCCCAGGGG AGAAAGAGGA GTAGCATGAA 2520 

CGCC3UM3GAA TGTAOGTTGA 6AATCACIGC TCCftGGCCTG CATTACIOCT TCAGCTCTGG 2580 

GGCAGAGGAA GCCCAGGCXZA AGCACXW3GGC TGGCABGGC5G TGftGGAACTC TCCTGTQGGG 2640 

TGCTCATCAC CCTTCX^GACA GGAGCACTGC ATGTCAOAGC ACTTTAAAAA CAGGGCAGCC 2700

TGCXTG0GC6 CTGGGTCTCC ACCOCACTGT CATAAGTGGG GAGftGAGCOC TTCCCAGGGC 276D 

ACCCAGGC3U3 OTGCSVGGOAA GXGCAGASCT TGTGGAAAGC OTGTGAGtGA GGGAQACAGQ 2B2D 

AAiCGGCTCTG GGGGTGGGAA QrOGGGCTAS GTCrtGCCAA CTOCATCTTC AAIAAAGIXSG 2880 

TTTTGOSATC CCTGAAAAAA AAAAAAAAAA AAAAAAAA 2918 

Seq ID NO: C194 DNA Sequence

Nucleic Acid Accession #: NM_021910.1

Coding sequence: 260..601

1 11 21 31 41 51

| | | | | |

GTTCTCCACA ACTGCCAGCA ATCCTTCCAC CAGGCAAAAC ACAXCATCTA 2U3GhAAAGAA 60 

GTGAGGTTTG CTTAGGGOGT OQCAGCTTCG GATAAACOCA GGACICCGCC TGGCAGCCCG 120 

ATTTCTCCOG GAACCTCTGC TCRQCCTGGT GAACCACACA GGOCCOAQTT TCPiCCCMSTCC 180 

CCGACTCCAC GGTGCAGC^CG OGGCTTATCT CTCAGCCCAG GGAGATGCCA GCCTTOCTOT 240

CC0GGG0CA6 CGCTCTQAC?^ TGCAISA2USGT GACCCIGGGC CTQCTTOTGr rrCCIGGCAGG 3Q0 

CTTTCCTGTC CTGGACGCCA ATGACCTAOA AGATAAAAAC ASTCCTTTCT ACTAIGACTQ 360 

GCACfiGCCTC CAGGTTGGOG GaCTCATCTG OGCTGGGGTT CTOTGCGCXa. TGGGCATCAT 420 

CATCGTCATG AGTGAGTQGA GGAGCTCGGG QGAGCAGGC2G GGGOGGGGCT GGGGCICCGC 480 

TCCCCXGACC ACTCAGCICT CCXICAACAGG TGCAAAATGC AAATGCAAGT rrTGGOCAGAA 540

GTCCGGTChC CATOCAGGGG AGACTOCACC TCTCAlTChCC COCGGCTCAG CCCAAAGCTB 600 

ATBAGQAGAG AOCAGCTGAA ATTGSSrGGA GGACCGTTCT CTQTCCOCAG iSTOCTGTCTC 660 

TGCACAGAAA CTTGAACTCC AGGATQGAAT TCTTCCTCCT CPGCTGGGAC TCCXTTQCAT V20 

GGCROGGCCr CATCTCACCT CICXSCAAGAG QGTCTCTTTG TTC31ATTTTT TTTAATCTAA 780 

AATaATTGTG CCTCIGOOCA AGCAGOCTGG AGACrTTCCTA ISXGTGCATT 06GGTGGGGC 840

TTOGGGCftCC ATGAGAAGGT TGGCaGTGOCC TGGAGGCIGA CKCAQAGGCr GGCACTGIUGG 900 

CTGCITGTXG GG?MASGCC ACAGGCCTGT TCXSCTTOTGS CITGGGACAT GGCACAGQCC 960 

OGCCCTCTGC CTCXTTCAGCC ATGGGACCIX: ATATGCAATT TSOGATTTAC TAGTAGCCAA 1020 

AAGGAATGAA AGT^GAGCTCT AACCAGATGG AACAClGGAA CATTOCAGTG GACCCTGQAC 1080 

CATTCCAOGA AAACIGGQAC ATWSGATOGT CCCGCTATQA TGQARGOSTT CAGRCRGTTT 1140

ATAKTAST2U^ GOCGCTGTGA CCCTCTCACT TACCCGG9\GA CCTGACTTTA TTACAAGAXG 1200 

77TTOCAAATA CCCAAATATC CCTGCAAGOC CGTTAAATAA TTCOCTATGC TAGCCTTAAT 1260 

AACATACAAT GAOCACATAG TGrOAGAJlCrr TCCAACAAGC CTCAAAGTCC CTTOAGACTC 1320 

CCXaATACCT AATAAGGCRT QOGAAATGTT CTCATGRACX ACCCX3«»AC ACGOCTAAAA 1380 

CTCAAAACAG CC3VAAAATAT CTCCTCCAAT GTCCTGAQAC ATQAAOCCAA AAAGAGACCC 1440

ACAATAftACX CSEGACTTGI CCOCTC 1466 

Seq ID NO: C195 DNA Sequence
Nucleic Acid Accession #: NM_005971.2
Coding sequence: 276.. 43 9



1 11 21 31 41 51

| | | | | |

GTTCTC3CACA ACTGCCAQCSi ATCCTTOCAC CftGGCAAAAC AjCATCATCTA AGGAAAAGAA 60 

GTGMSGITTG CTTAGGGOGT GGCAGCTTOS GATAAAGGCA GSACTOCGCSC TGOCAGCCXS 120

ATXTCTGCSaG jGAACCICXGC TCAGCXTTOGT GAACCACACA GGCCAOOGCT CTGACATOCA 180

GAAQGrGACC C3GGGGCIGC TTOTQITCCT GGCAOGCTTT GCTGTCCTGQ ACGCCAATGA 240 

CCTAGAAGAT AAAAACAGTC CTTTCTACTA TGACTGGCAC AGOCTCCAGG TTOGCGGGGT 300 

CATCTGOGCT GOGGTTCTGT GOGCCATQOa CATCATCATC GTCATOftOrQ CAA2UmSC3iA 360 

ATGCSAGTTT GGCCn9AA<3T CXXaGTCAiCCA TCCAGGGGAlG ACTCCACCTC TCATCAOOOC 420

AGQCrCZUSGC CftAZU3CIG»r GAGGACZ^GAC CAGCIGAAAT TGGGTQSAGS ACOGTTCTCT 480 

GTCOCCAGOT CCTGTCTCZG CACAGAAACT TGAACTCCAG GK£&SbK£TC TICCTCCTCT 540 

GCTGGGACTC CTTTGCATOG CAGGGCCTCA TCTCACCTCT 06CAAGAGGG TCTCTTTGTT 600 

CAATTTTTTT TAATCTAAAA TOATTGTGCC TCTGOCCAAG CAGCCTGGAG ACTTCCTATG 660

TGTGCATTGG GGTGGGGCTT GGGGCACCAT OAGAAGGTTG GOGTGCCCTG GAGGCTGACA 720

CAGAGQCT0B CACTGAGOCT GCZTGTTGGG AAAAGCSGCAC AGGOCIGTTC CCTT6TGGCT 780 

TOGGACATGG CACM36CCGG COCTCTOCCT CCXCAGOCAT GGGACCTCAT ATGCAATTTG 840 

GGATTTACTA GTAGCCRAAA GGAATGAAAG AGAGCTCTAA CCaGATGCSAA CACTGGAACA 900 

TTCCAGTGGA CCCrrGGACCA TTOCAGGAAA ACTGGGACAT AGGATOGTOC CGCTATGATG 960 

G24A0rGTTCA GACftGTTTAT AATA6TAAGC CCCTGTGACC CTCTCACTTA CCtXXaAQACC 1020

TCACTTTAST ACSkAGATCTT TCCAAATAOC CAAATATCCX: TGCAAGOCXXS TTAAATAATT 1080 

CCCXATGCTA OOCTTAAXAA CATACAATGA OC^^CATAGTG TGAOAACTTC CAACAAGCCT 1140 

CAAAGTOGCT •EGAGACTCGC CAATAGCTAA TAAGGCATGC GAAAT6TTCI CATGAACTAC 1200 

GCCACAACAC GOCXAAAACr CAAAACAOCC AAAAATATCT CCTOCAATGT CCIOnfiACAT 1260 
Seq ID NO: C196 DNA Sequence

Nucleic Acid Accession #: NM_004961.2

Coding sequence: SS.-isvs

1 11 21 31 41 51

| | | | | |

GCCAGAOCGT GAGCCGCGAC CTCOSfCGCAG GTGGTCGCGC OGGTCTCCHC GGRARTGTTG 60
TCCAAAGTTC TTCCAGTCCT CCTAGGCATC TTATTGATCC TCCftffTCGAG GGTCGAGtSOA 120
CCTCAGACra AATCAAAiQAA TGAAGCClCT ICCCGTGKTG TTGTCTAXOQ CCSXSMSCCC 180
CAGCCTCTGG AAAATCAGCT CCTCTCTGAG GAAACAAAGT CAACTGAGAC TGRGACTGGG 240
AGCAC3AOTTO GCARACTGCC RQRAGCCTCT CGCATCCTGA ACACTATCCT QAOTAATTAT 300
GACCACAAAC TGCGOCCTGG CATTGGAGAG AAGCCCACTQ TGGTCACrGT TGAGATOGCC 360
OTCAACAiaCC TTeerCCTCT CTCTArCCTA 6ACATGGAAX ACACCATTQA CATCATCTTC 420
TCGCAGACd GGTAGGACGA AOGOCFCTGT TACAAOaACA CCXXieAGTC TCTTC3TTCTG 480
AATGSCftASG TGGTOAGCCA OCTAlieOAlX: CCGQACftCCT TTTTTAGOAA TTCTAAGACSG 540
ACCCT^XXSAGC ATGAGATCAC CATGCCCMJC CAORTGOTCC GCATCTACAA GGATOGCAAQ 600
QTGTTOTACA CAATTAGQAT GACCATTGAT GCOGGATGCT CACTCEACAT GCTCAC3ATTT 660
CCAAIGGATT CTCACTGTTG GOCTCTATCT TTCTCTAOCT TTTCCTATCC TGAGAATGAG 720
ATBATCTACA AOTOaOAAAA TTTGAAGCIT 6AAATCAAT6 AGAAlQAACTC CTBQAACSCTC 780
TTGCA6TTT6 ATTTTACSAGG AGT6AGCAAC AAAACTQAAA TAATGACAAC CCCAGTTOGT 840
QACTTCATGQ TCATQAC&AT TTTCTTCAAT GTGAGCAOGC GGTTTOGCTA TOTTGCCTTT 900
CAAAACTATC TCCCTTCTTC C&TGACXawzo ATGCTCtCCT GG(3TTTCCTT TTOGATCAAQ 960
ACAQASTCT6 CTCCAfiOGCG GACCTCTCTA GGGATCAOCT CE<irrCXGAC CATGAGCACG 1020
TTGGGCACCr TTTCTGOTAA GAATTTCCSCfi CXjfS&CCCCCX AIATCACAGC CTTOQATTTC 1080
TATATCGCCA TCTGCTTCGT CTTCTGCTTC TQCGCTCTQT TGGftGXTTGC TGTGCTCAAC 1140
TTCCTGATCT ACAAOCAGtAC AAAACCCCAT GCTTCTC5CTA AACTCCOCCA TCCTOSTAarC 1200
AATAGCCXSTS OCCATGCX30G TACCCOTQCA aiTTCCOGAfi CCTGTGCCC3G CX»ACATCAO 1260
OAAQCTTTTG TGTGCCAGAT TGTCACCACT GAGGQAAQTQ ATOCSAGAGGA GOGCCOGTCT 1320
TGCTCAGCOC AOCIU3CCCCC TRGCXX3U3GT AGCXXTTGAGG GTOCOCBCAa CCTCTaCTCC 1380
AAGCTGGCCC GCTGTGASTG GTGCAAGOGT TTTAAaAAST ACTTCTGCAT XSGTCCCCGAT 1440
TGTGAGGGCA GTACCTGGCA GC3VGC3aCCOC CTCTGC5\TCC ATGTCTACCG CCTOGATAAC 1500
TACTCQAGftS TTGTTTTCXX: AGTGACTTTC TTCTTCTTCA ATGTGClCTA CTGGCTTGTT 1560
TGOCTTAACT TQTAGGTACC AOCTGGTACC CTGTGGOGCA ACCTCICCAO TTCC3CCAGGA 1620
GQTCCAAGOC: CdTGCCAl^ GGAGTTGGGG GAAAQC»GCA GCAfiCAiGCftG GAG0GACTA6 1680
A0TT7TTCCT GCOCCATTGC CCAAACAGAA GCtTGCA8A6 GGTTTGTCTT TQCTGCCCCT 1740
CTCCCCTACC TGGCCCATTC ACTGAOTCTT CTC3VSCRGAC C9VTTVCAAAT TATTAATAAA 1800
TGGGCCAC5CT CCCTCTTCTT CftAaOAGCAT CCGTGATGCT CRGTGTTCAA AACCACAlGOC 1860
ACTTA6IGAT CAGCICCCTA AAAOCATGCC TAAOTACAGQ C3GiGATTAGCT A^TCTTCCAAC 1920
AATGCTOAiCX: AOAGACAAT TAClGCATCX TTOCAGAAGC GCACn^TQC CTTTQTAGTG 1980
CTTTa3GCCC AGTTCIGGGC TGAGCCTCAA AGTQCftCC3GA CTAGrrGCTT GOCIATACCT 2040
GQCACCTCAT TAAOATGCTG GGCAGCftGTA TAACAGGAG6 AAGAGATCOC TCTCCTTTGG 2100
TCAGATTATT ATGTTCTCAG TTCTCTCTCC CTGCPACXX^C TTTCtCItSCA GATAGATAGA 2160
CACTGGCATT ATCCCTTTAG GAAf3AC30C3GG Q<3GCftGCARG AGAGCCTATT TGGGACACKIA. 2220
TTCCSCICrC TCXGCTGCIG TGKXICSCC CTCTGCrroC TOQCTCCATC TCTOGTCTGC 2280
ACZAOCIWIT GAATOGGCIT CATGC3UVIGG GTASCTMITT TTGTGTGTOA TTATAGTAAC 2340
TACTCCCTGC TTTATATGCC AOCCTCTTOC TTCTCTTTQR CCOCXGTGAC TCTTTCTGTA 2400
ACTTTCCCAG TQACTTCCCC TAGCXXTTGAC OCAGGCACTA GGCCTTOGTQ ACTTCCrGGG 2460
GOCAAGAAAC TAAGGAAACT CGGCrTTQCA ACAOGCATTA CTCX30CATTG ATTG6T600C 2520
AlCCCAGGGCA CACTGXCGGA GTTCfATCAG TTGCTTSACC CCTGGACCCA TAAACCAGTC 2580
CACTSTTATA CCOGGGGGAC TCTAACCATC ACAATCAATC AATCAAATTC CCTTAAATTT 2640
GWVMSCACr GGAACTTTGG CRAAGCACTT TTGACAAGTT OTOTCTGRTX GGAGCTTCAT 2700
GATAGCCTTO TGAC3VTCTTT AOaOCAGGAT TCtTATCOGC ATTTTGCaiGA TOAAAAOCCT 2760
GAGTCACAGA TTTCTGTGGG ACTGTGGATC TCACTGOAAG CCATOCAAGA GCCCACTGTC 2820
ACCTTCZI^QA CCACATGATA GOGCTAGACA GCTCAGTTCA GOVTGATTCT CXUCSi&tCliC 2880
CXCTGCIGSC ACACCAGTGG CAAGGCCCAQ AATOGOOAlX TCECTTTAGC TCAATTTCTG 2940
GGCCTGAGOT C3CTC5W3ACrO OCS2CCAAGAT CAAATCTCTC CTGGCTGTA3 TAACOCWStG 3000
GAATGAATTT QGACATGCCC CAATGCTTCT ATATGCTAAG TGAAATCPGr GTCTGTAATT 3060
T6TTGGOGGG TGGATAGGGT GGGGTCTCXT^ TCTACTTTTT OTCACCATCA TCTGAAATGG 3120
GGAAATATOr AAATAAATAT ATCAGCAAA6 CAAAAAGAAA AAAAAAAA 3168

Seq ID NO: C197 DNA Sequence

Nucleic Acid Accession #: NM_021984.1

Coding sequence: 572..1753

1 11 21 31 41 51

| | | | | |

GCCAGAGC3GX GAGOOGGGAC CTOCGCGCAG GITGOTOQCGC CGGXCTCGGC GGAAATGTTG 60

TCCAAAGTTC TTCCAGTCCT CCTAGGCATC TTATTGATCC TCCAGTOGAG AACATGTTATA 120

CAGAGAAOTG CCCAAATCAT AAGTGTAGAG CTGATGAGTT OrCAAAAAAT GACCACASCG 180

GTGTAAAGAA AGOCAAATCA AGGAOOCGAA TGTGA0CAG6 AOrrCAGAAS 240

CACTGCCTCC CAGCAAAGGC AGCAOTATCC GGAXjrrCTAA CAGCA7CGGG TOGAGGGACC 300

TCAGACTOAA TCAAAGAATG AAGCCTCTTC CC6TGATGTT GTCTATGGCC OCCAGCOOCA 360

QCCTCTGGAA AATCAQCTGC TCTCTSAGGA AACAAAGrrCA AC3X3AGACTG AGACTG6GA6 420

CAGAGTTQ3C AAACTGCCAG AAGCCTCTCG CATCCIC3AAC ACTATCCTOA OTAATTATQA 480

OCWaAACTG (3GCCCTGGCA TTGGAGAGAA GCCCACTGTG gtcactgttg AGATCTCCGT 540

CAACAGCCXT GGTCCTCTCT CTATCCTAGA CATGGAAIAC accattqaga TCATCITCTC 600

CCAGACCTGG TAGGAOGAAC GOCTCTGTTA CAACGACAiCC TTTGAISTCTC TTGTTCTGAA 660

TGQCAATGTG QTGAGCCAGC TATOGATOCC GGACACCTTT TTTAOGAATT CTAAGAGGAC 720

CCACGAECAT GAGATCACCA TGCCCAACCA GATGGrcCGC ATCTACAAGG ATGGCAAGOT 780

QTTGTACACA ATTAGGATOA CXZATTGATGC OGGATGCTCA CTCCACATGC TCAGAITTCC 840

AATGGArrCT CACTCTTGCC CTCTATCTTT CTCTAGCXTT TCCIATCCro AjQAATQAGAT 900

GATGTACftAG TGGOAAAATT TCAAGCTTOA AATC31ATQAO AASAACICCT G6AAGCTCTT 960

CC3U3TTGGAT TTTACAGGAG T6AGCAACAA AACXGAAATA ATCACAACCC CAGTTGGTGJV 1020

CTTCaaXSGTC AT6ACGATTT TCTTCAATGT GAGCAGGCGG TTTOQCCATG TTQCCTTTCA 1080 

AAACTATCTC C CT T C TTCCG TGACXT^OOAT GCTCTCCTGG GTTTCCTTTT GCSATCAROAC 1140 

AGAGTCTGCT CCAGCCCGOA CCTCTCTAGO GATCACCTCT GTTCTGACCA TOACCACGTT 120 D 

GGGCACCTTT TCTCGTAAGA ATTTCICCGOS rGTCTCCTAT ATCACAiQCCT TGGATTTCTA 1260

TATGGCCATC TGCTTOGTCT TCTGCrTCTO OGCXCTGTTG 6AGTTTGCTG TOCTCMCrr 1320 

CCTI3ATCTAC AACCftGACAA AAGCCCATGC TTCTCCTAAA CTCCX30CATC CTOGTATCRA 1380 

TAGCCQTOCX: CATQCCaSTA CCCGTGCACG TTCCXZGfiSCC TQTGCCCGCC AACATCAOOA 1440 

AGCrTTTGTG TGCXZAGATTG TCACCACTOA StSGAAGTGAT GGAGAGGAGC GCCCGTCTTG 1500

CTCAGCCCAQ CAlSCCCCCTA QCCCAGGTAG OCCTGRGGGT CCCCQCAGCC TCTGCTCXAA 1560

6CTG6CCXGC TSIGAGIGGT 6CAAGOGTTT TAAGaAGl!2^ TTCTGCATGG TCCCCGATCG 162 D 

TGAGGQCAGT AGCTGGCAGC AGGCCCX3CCI CTGCATCCAT GTCTAOCGCC TG6ATAACTA Z680 

CTCGAGAGTT GTTTTCCCAG TGACTTTCTT CTTCTTCAAT GTGCTCTACT GGCTTGTTTG 1740 

ccrrAAcrra taqqta(3Cag ctggtacgct gtggggcaac ctctccaqtt ccocftCGAGG laoo 

TCCAAGCCCX: TTGOCAAOGG AGTTGGCSaQA AAGCAGOUSC AGCAGCAGGA GCQACTiUGAG 1860

TTTTTCCTGC CCCIVITCCCC AAACA6AA6C TTGCAGA0GG TTTGTCTTT6 CXGCCCCICt 1920 

CCCGIACCTG GCCCATTCAC TGAGTTTTCT CAGCAGACCA TTTCAAATTA TTAATAAATO 19B0 

QGCCaCCTCC CTCTTCETCA AfiGAGCATCC GTGATGCTCA QTGTTCSIAAA CCACAGCCAC 2040 

TTAGTGATCA GCTCXCTAAA ACCATGCCTA ASTACS^QCSOG GATTAGCTAT CTTCOVRCRA 2100 

TGCTGACCAC CAGACSU^TTA CTGCSiTTTTT CCA6AAGCGC ACTATTQGCZ TTGCAGTGCT 2160

TTGGGCCCA6 TICTGGCCTC AGOCTCAAAG TaCAGCQACT AGTIGCITGC CTATACCTOQ 2220 

CACCTCATTA AOATGCTOGG CAGCAGTATA ACAG6AGQAA GA6ATCCCTC TCCTTTGGTC 22 BO 

A6ATTATTAT GTTCTCAGTT CTCTCTCGCT GCTACCOCTT TCTCTGCAGA TAGATAOACA 2340 

CTQGCATTAT CCCTTTAGGA AGRGGGGGGG GCAGCAAQAG ACSCCTATTTG GGACAGCATT 2400 

CCTCTCTCTC TQCTGCTGTG ACATCTCCCT CTCCTTGCTG GCTCCATCTT rGGTCrGCAG 2460

TACCAATTGA A1X3C0CTTCA TCGAATGGGT ATCTATTTZT (3T6TGTGATI ATAGTAACTA 2520 

CrCCCTGCTT TATATGCCM CCTCTTOCTX CTGTTTGACC CCTGTOACTC TTTCIGTAAC 2580 

TTTCCCAGTG ACTTCCCCTA GOCCTGACCC AGGCACTAjOG CCTTGGTGAC TTCCTGOGQC 2640 

CAAGAAACTA AGGAAACTCG GCTTTGCAAC AGGCATTACT CGCCATTGAT TGGTGOCCAC 2700 

CCAGGGCACA CTGTCQGAGT TCTATCXiZTT GCITGAGOCX: TGGAOOCATA AACCAGTCCA 2760

CTQTTATACC CGGGGCAJCTC 13UVOCATCAC AATC3ATCAA TCAAA-TTGGC TTAAATTTGT 2820 

ATGGCACTGG AACTTTGGCA AAGCACTTTT GACAA6TTGT GTCTGATTGB JUSCTTCA^EGA 2800 

TAGCCTTGTG ACATCTTTAG GGCAGGATTC TTATCCOCAT ITTQCMATG AAAACCCTGA 2940 

GTCACAGATT TCTGTGGGAC TGTGGATCTC AClGGAAGCT ATCCAAGAGC CCaU^TQTCAC 3000 

CTTCTAOACC ACATQATAGG GCTTAGACAGC TCAGTTCACC ATOATTCTrCT TCTGTCAOCT 3060

CIGCTGGCftC AGGA6TOGCA AGGOCCAC3AA TGGCGACCTC TCTTTAGCn: AATTTCTGGa 3120 

CCTOACKXraC TC3U3AC^£GCC CCCAAGATCA AATCTCTCCrr OGCnXTAGIA AGCCAGTGGA 3180 

ATGAATTTGG ACA3K3000CA ATGCTTCTTAT ATGCTAAGTG AAATCTGT6T CTOIAATTTG 3240 

TTGGGOGGTG GATAGGGTGG GGTCTCCATC TACTTTTTGT CACCAXCftTC TGAAATGGG6 3300 

AAATATGTAA ATAAATATAT CAGCAAAQC 3329



Seq ID NO: C198 DNA Sequence

Nucleic Acid Accession #: NM_021987.1

Coding sequence: 572..1657

1 11 21 31 41 51

| | | | | |

GGCAGAfiCGT GAGCOGOGAC CTCOGOGCAG GTGGTOGCGC: C9GGTCTOC3GC OGAAATQTTG 60 

TCCAAAGTTC TTOC3V3TCCT CCTAGGCATC TTATTGATCC TCCAGTOaAG i^ACAIGrATA 120 

CAGAGAAGTG CTCAAATCAT AAGTGTACAG CTGATGAGXI GTCAAAAAAT GACCACAGC3B 180

GTGTAAASAA AGGCAAATCA AOGAOCOSAA TGIGAGCAGQ AOCTCAOAAG OGCGCrTTGr 240 

CACTGCCIGC CW3CAAABGC AGCACTATQC GGACfTCTAA CAOCATOGGG TOOAGGaACC 300 

TCaSACIGAA TCAAAGAATG AAGCGTCTTC CCOTGATGTT GTCTASTGOCC CCCAGCCCCR 360 

GCCTCTGGAA AATC3\GCTCC TCTCTGRGGA AACAAAGTCA ACTQAGACTG AGACTGGGAG 420 

GAfSAGTTGGC AAACIGCGAG AAGCCTCTGS OLTOCIGAAC ACTAICCIGA GTAATTATGA 480

CCftCAAACTG OaCQClGGgi nSGAGAGAA 6GGCACI6TG GTCACTGTTG AGATCTGGGT 540 

C3^ACAGC9CTT GGTGCTCTCT CTATCCTAQA CATGQAATAiC ACCKtCGUCA TCATCTTCIC 6QQ 

CCAGACCTGG AATTCTAAQA OGACCCAOQA GCA'IGAGATC AOCATGCCCA ACCAGA'OXSGT 660 

C06CATCTAC AAGGATGGCA AGGTaTTGTA CACAATTAGG A3K5ACCATTG ATGCCGGATG 720 

CTCACTCCAC AXGCXCAGAT TTOCAATGGA TTCTC3VCTCT TGCCCTCaAT CTTTCTCTAG 780

CTTTTOCTAT OCTGAOAATG AGATGATCTA CAAGlTGGQAA AATTTCAAGC TTGAAATCAA 840 

TGAGAAGAAC TOC1X3GAA6C TCTTCCAGTT TGATTTTACA GGAGl'GAGCA ACAAAACT6A 900 

AATAATCACA ACX3CCAGTTQ GTGACrTCAT GGTCATGA06 ATPTTCTTCA ArarOAGCRS 960 

GOGGTTTGGC TATGTTGCCT TTCAAAACTA TGTCCCTTCT TCGGTGACCA CGATGCTCTC 1020 

CTGC3GTTTOC TTTTOGATCA AGACAGAGTC TGCICCAGCC CGGACXTFCTC OSVGGGATCAC 1080

dCTGTTCTG ACCATGAOCA GBTTGQOCShC CTTTTClOGl! AAGAATTTCC OGOOTGTCTC 1140 

QTATATCSkCA GGCTTOGATT TTCTATATCGC CATCTGCTTC GTCTTCTGCT TCTGGGCTCT 1200 

6TTGGAGTTT GCTGTGCTCA ACTTCX:tOAT CTACAACCAG AGAAAAGOOG ATGCTTCTCC 1260 

TAAACTCCGC CATCCTCG31A TCAATAGCCG TCOCCAT6CC C33XAOCCS3T6 CSiOGTTCXX3G 1320 

AGCCIXSTGOC CGOCAAjCA!rC AQGAAOCTTT TGTGTGCCAG ATIGTCACCA CTGAGGGAAG 1380

TGATOSACAG GAGGGCOOGT CTTGCSTCAiQC CCAGCAGOCC CCTAGOCCAO QTAGCCCTGA 1440 

GGGTOGQCX3C AGCCTCIGCT COUUSCIGGG CTGCTGTQAG TGGTGCAAGC GTTTTAAGAA 1500 

GTACITCTGC ATGGTCCCOG ATTGTGAGGG CAGXAGC3?GG CAGCAGGGOC GCCTCTGCAT 1560 

CCATGTCTAC CGCClGGATA ACTACTCGRG AGTTGTTTTC CCAGTGACTT TCTTCTTCTT 1620

CAATGTGCTC TACTGGCTTG TTTGCSCIXAA CTIGTAOGTA CCAGCTGGTA CCCT6TGGG6 1680
CAACCTCICC AGTTOCCCAG GAGGTCCAAG CCCCTTGCCA AGGGAGTTGG GGGAAAGCAG 1740 

CAGCAGCAGC AGGAGCGACT AGAGTTTTTC CTGCCCCSVIT GCOCAAACAG AAGCTIGCAG 1800 

AGGOTTTGTC TTTGCTGCOC CTCTCCCCTA CCTGGCC3CAT TCACK3AGTT TTCTCAGCAG I860 
ACCATTTCAA ATTAT7AATA AATGGGCCftC CTTCCCTCTTC TTCAAQGIAGC ATOCGXGAIG 1920
CTCAGTQTTC AAAACCACAG CCACTTAGTQ ATCAQCTCOC XAAAAGGATG CCTAAGTAC3V 1980
GGCaOATTAG CTATCTTCCA ACAATGCTGA CCACCAGACA ATTACTGCAT TTTTCCAGAA 2040 
GCOCACTATT GCCTTTGCAG TQCTTTOGGC CXSDGTTCTGQ CCXCAGCCTC AAAQTGCACC 2100 
GACTAOTTGC TTGCCTATAC CTGGCACCXC ATTAA6ATGC TGOOCAGCAG TATAACAGGA 2160 
GGAAGS^TC CCTCrCCTTT GGlCAGATTA TTATGTCCTC AGTTCTCXCX CCCIGGXACC 2220 
CCTTTCTCTG CAGATAGATA GACACTGGCA TTATCCCTTT AGGAAGRGGG GGGGGCAGCA 2280

AGAGAGCCTA TTTGGGACAG CRTTCCTCTC TCTCTGCTGC TOTOACATCT CCCTCTCCTT Z340 

6CTGQCTCCA TCTTTCOTCT GCACTACCAA. TTCAATGCCC TTCATCCAAT <30GTATCTAT 2400 

TTTTGTGTGT OATTATAGTA ACXACTCCCX GCTTTATATQ CCACXXTXCTT OCTTCTCTTT 2460 

eACCCCTOTG ACTCTTTCTQ TRACrrTCXX: AGfTGACTTCC OCTAGCCCTB ACCAOGCftCT 2520

AGGCCTTGGT GACTTCCTGG GGCCAAQRAA CTAAGQAAAC TOGQCTTTGC AACAGGCATT 2580 

ACTCGCCATT GATTGGraCC CACXXMGGC ACACTGTCGG AGTTCTATCA CTTGCTTGAC 2640 

CCCTGGRCOC ATAAACCAGT CCACTGTTAT ACCCGGGGCA CTCTAACCAT CSiCAATCftAT 27 00 

CAATCAAATT CCCTTAAATT TGTATGGCAC TGOAACTTTQ QCAAAGCACT TTT(3ACAAGT 2750 

TGTQTCTGAT TGGAGCTTCA TGAlAGOCTr GTGACATCTT TAS06CAG6A TTCTTATC3CC 2820

CATTTTGCRG ATGAAAACCC TGAGTCRCRG ATTTCTGTGG QACTGTGGAT CTCACTGGAA 2880 

GCTATCCAAG AGCCCACTGT CACCTTCTAG ACCAiCRTGAT AGGGCTAGAC AQCTCftOTTC 2940 

ACCATGATTC TCTTCTGTCA CCTCTGCTGG CACACCAGTG GCAftGGCCCA GAATGGOGAG 3000 

CTCTCTTTAG CTCAATTTCT GGGCCTGAQG TGCTCAGACT GCOCXXSUVGA TCAAATCTCT 3060 

CCTGGCTGTA GTAAOCCAGT GGAATOAATT TGGACATGCC CCZUVTOCTlTC TAXA!I6CrAA 3120

GTGAAATCTG TGTCT6TAAT TT6TTGG6GG OTGGATAGGG TGGOGTCTCC ATCTACTTTT 3180 

TGTCACCATC ATCTGAAATG faOQAAATATG TAAATAAATA TATCAGCAAA GC 3292 

Seq ID NO: C199 DNA Sequence
Nucleic Acid Accession #: NM_021990.1
Coding sequence: 1309..2490

1 11 21 31 41 51

| | | | | |

GCCAGAQCGT OAGCCGCXSAC CTCG6CGCAG GTGGXCGCQC CGGTCTOCGC GGAAATGTTG 60

TCCAAA6TTC TTCCAGTCCT CXTTAGGCATC TTATTGATOC TCCAGTCGAQ AAiCftTGTATA 120 

CAGAOAAjaTa CTCAAATCAT AAGTGTACAG CTGATGAfJIT GTCAAAAAAT GACCACAGCG 180 

GTGTAAAGAA AGCCAAATCA AGQACCCGAA TGTGAGCAGG ADCTCAQAAG COCCClTTGT 240 

CaCTSCCTCC CAGCAAAOGC AGCACTATCC GGRjCTTCtAA CAOCATOSGT GAGTTTCATA 300 

CCITGGGAGA TOGCCTTTAA CRTTTTTGTT TAATTCAATT AirTCTTACTA ATCITCTTCT 360

TTTTCITGGC TGTGGTGCAT 6QCTGTGGAQ CTCAGGGTGG ACTOCTGTTG GGCAQCJCAOT 42 D 

TCCrGGATGQ CTGTCTGTQQ QTGGRGGACT CCTGCCTTTC CTQTTTAGAC ACXXIACAAAG 480 

GCTQCTCTTT AfiCCTCCrTC CCTTCATCCC CTTCCCXITGC CCXX:WST<3CA AOGAGTATTA 540 

CACAACCAAC AAAA<XGCAA AATATTCCCA CAATTTTCTO GTCXITCTCTG G6AGAGGCGG 600

CTCTGGCTTT TCCrrCTCAGC CCTCGCCXTTC TGCCTGCECC TGACTCCTGG TTGGTGCTGa 660

TCAGGCTGAC TAGRGOtXM. QGOSACCMC ACTAGGCAAA CGOGGCCAGC GCTCAGACAT 720 

AAATGCCCrC TTCATTTCAC GTGTAACAXT CTTTTAARAT CTAGGTCTTG GTXTTGTTGA 780 

TTTTTTCTTA AATAAAAGAG TGATCArCAAA AGAGOGACAG CATAGAAAGT GOOCZUUU3A6 840 

CAGCAAGGTT TTAAAGAAAT TCACA7W3CCT AATCTC3TCAC TGTCTrATAA TTTOCXATTA 900 

CCAGTCAC3\A TTTAACTAGQ TTTTGTGTIG AAAACTTGTT TTGQTTTGCT VCT6TGCCAA 960

GAGGCRCTAG CTGGQGCCCC TACAGAGTGC AGOaCftGAGC TTCATTTTTC GTTraAATGT 1O20 

TCTAGGGTCG AGGGACETTCA GAOTGAATCA 3\AGAATGAAG CCTCTTCCC3G TGA1GTTGIC 1080 

TJOGGCXICCC AGCCCCAGCC TCT6GAAAAT CAGCTCCa:C:T CTGAGGAAAC AAAGTCAACT 1140 

GAGACTGAGA CTGGGAQCAa AGTrGGCAAA CTGGCAjSAAG CSCTCrCJOCAT GCTGAACACT 120 0 

ATCXITOAQTA ATTATGACCA CaAACTGCGC CCTGGCATTG GAGAOAAGCC CACTGTGGTC 1260

ACT6TTGAGA TCTCOGTCAA CAGCCTTGQT CCXCTCTCIA TCCTAGACAT GGAATACACC 1320 

ATTQACSITCA TCTTCTCCCA GAOCTGGTAC GACJGAACGCC TCIGTIACAA OGACAC CITT 1380 

GAGTCTCTTS TTCTOAATOG CAATOaXSSlO AjGQC3«3CTAT GGATCOOaQA CACCTTPrTT 1440 

AGGAATTCTA AGAGGACCCA GGAGCATGAG ATCACCATGC GCAACCAC3KT OOTCOQCKTC 1500 

TACAAGGAT9 GCAAGGTGTT GTACACAATT AGGATGAOCA TTGATGCSOSG AIGCICACTC 1560

CACATGCTCA GATTTCCAAT GGATTCXCAC TCTTOOCCTC TATCTTTCTC TAGCTTTTCC 1620 

TATCCrOAGA ATGAOATGAT CTA£!AAG106 GAAAATTTCA AQCTTOAAAT CAATGAGAAG 1680 

AACTGCXOGA AGCTCITCCA GITTGATTTT ACAGQAGIGA GCAACAAAAC TQAAAXAWTC 1740 

ACAACCC9CAG TTGGTGACTT CATG6TCATG AOBATTTTCT TCAaiGrGAG GAGGGaGTTT 1900 

OGCrATCTTG CCTTTCAAAA CTATGTCCCT TCXTCGGTGA CCAOGATI3CT CIQCrGGOlT 1860

TCCTTTOXSGA TCAAGACAGA GTCTGCTCCA GCCCGGAOCT CTClTAGGGAT GAGCTCTGTT 1920 

CTGAOCATGA CCAOGTTGGG CACCTTTTCT CGTAAQAATT TCCCGOGrGT CXCCTATATC 1980 

ACnacarrOS ATTTCTATAT GGCCATCTGC TTGGTCTTCT GCXTCSGCGC !CCrSTTGGAG 2040 

TSTGCTGIGC TCAACTTCCX GATCTACAAC CAGACAMAG OCCAT0CZTC TCCXAAACTC 2100 

CXSCCATCCTC GTATCAATAG OOGTGCCCAT GOCCGTACCC GTOCAG6TTC COGAGOCTGT 2160

GCCCGCCAAC ATCAOOAAGC TTiTGlWGC C3USATTGTCA OOUZTOAGGG AASTGATGGA 2220 

GAGGAGOGOC CGTCTTGCTC AGCDCAGCAG CCC3CC3*AGOC CAGGTAGC5C3C TGAGGOTCXX: 2280 

CoaWSCCTCT GCXCCAAGCT GGCCTGCOrGX GAiGTGGTGCA AGOGTTXXAA GAAGTACTTC 2340 

TGCKIGGTCC COGATTOTQA OgGCAOTACSC TGGCAGCAGG GCOIOCXCIQ CATCCai3K3TC 2400 

TAIX36CCIt3G ATAACTACTC GAGflGTTGTT TTCCCAOIGA CrrrCTTCTT CTTCAATGTG 2460

CTCTACTGGC TTGTTTGCCT rnuvCTTGTAO QIACCAGCTG GTACCCTQTG GGGC31ACCXC 2S20 

TCCIAGTTCCC CAG6AGGTCC AAGCCCCTTG OCAAGGQAGT TGGGGGAAAG CAGCAGCAGC 25B0 

AGGhGGAlGGQ ACIAGAGTTr TrCCTTGOCCXZ ATIGCCCAAA CAGAAGCTTQ CAGAGGGnrT 2640 

GTCETTOCia CCXXITCrOOC CraCCTGGOC CATTCACrOA CRlWXCrCAG CAGAOCKTTT 27DO 

CAAAn?A«rrA AXAAATOGGC CIAGCTCCCTC TTCI1:CAAGG AGCATOCGTG ATGCTCMTTG 2760

tTCAAAACCA CAGOCACTTA GTGAKAGCT CCCTAAAACC ATGCCTTAAGT ACAGGOQGAT 2820 

TAGCTATCT^ CCAACAAXOC TGAOCAOCAG ACAATTACrTG CATTTTTCCA OAAGCCCACT 2860 

ATTGCCTTTG CAGTGCITTC GGOCCAGTTC OSGGCTCAGC CTCAAAGTOC ACOGACTAGX 2940 

TGCTTGCCTA TAOCtTGGCAC CTCATTAAGA T0CTGGGC3US GMSTATAACA GGAGQMGAS 3000 

ATCOCTCTOC TTTGOTCAGA TTATTA3GIT CTCAGTTCTC TCTCOCTGCT ACCOCTTTCT 3060

CTGCAGATAG ATAGACACTG GCATTATOCC TTTAGGAAGA GGGGGGGGCA GCAAOAQAOC 3120 

CTATTTGGGA CAGCATTCCT CTCTCTCTGC TGCTGTGACA TCTCCCTCTC CTTGCTGGCT 3180 

CCATCTTTOG TClGGACTAC CAATTCAATG CJOCTTCRICC AATGGGTATC TATTTTTOTG 3240 

TOTGATTATA GTAACXACTC CCIGCTTTAT ATGCCAGCCT CTTOCnCCC TTTGAOCCCT 3300 

GIGACfCTTT CT6TAACTTT CCCAGTOACI TCCCCIN3CC CTGACCAGGC ACTAOGOCTT 3360

GOTGACTTCC TGOGGCCAAQ AAACTAAGGA AACTCGGCTT TGCA7WCAGGC ATTACTOGCC 3420 

ATTGATTGGT GCCCAOCCAS GGCACACIGT OGGAGTTCCA TCfltTTQCTT GACCCCTGOA 3480 

COCAT2UUIC3C AGTCCACTGT TATACCCOGG OCACTCTAAC CATCACAATC AATCAATCAA 3540 

ATTQGCTTAA ATTTGTATGG CACTCSGAACT XTGGCAAA6C ACTTTTGACA AGTTGTCTCT 3600 

GA^TlGGAGCr TCAT6ATAGC cttgtgacat CTTTAfiGSCA ogatxcttat CCCCATTTTG 3660

           CCCTOAGTCA cagatttctq TGGGACTGTG GATCTCACTG GAAGCTATCC 3720

AAGAGCCCAC TGTCACCTTC TAGACCaCAT GATAGGOCTA GACAGCTCAG TTCACCATGA 3780

TTCTCTTCTG TCACxrrCTGC TGGCACACCA GTGGCAAGGC CCAGAATGGC GACCTCTCTT 3840

           TCTGGGC3CTG AGGTQCrCAG ACTGCCCCCA. AOATCAAATC TCTCCTGGCT 3900

GTAGTAACCC AGTG6AATGA ATTTGGACAT GCCCCAATGC TTCTATATGC TAAQTOAAAT 3960

CXGTCriCTBT AATTTGTTGG GGGGTOQATA GGGTGGGGTC TCCATCTACT TTTT6TCACC 4020

ATCATCIGAA ATGOOOAAAT ATGTAAATAI^ ATATATCAGC AAASC 4065



Seq ID NO: C200 DNA Sequence
Nucleic Acid Accession #: KK_Q218L9.1
Coding sequence: 39 . > 1£19



1 11 21 31 41 51

| | | | | |

mOGGGGQTC GOSTAKTTCO GCAOGAGGSC GCTC2U3GTAT GOCGGCBGTC ABTGGTCCAG 60

6TCCCTTATT CTGCCTTCTTC CTCCTGCTCC TQGRCCC30CA CAGCCCTGAG AGGGGGTGTC 120

CTCCTCTACQ CAGGTTTGAQ TACAAGCTCA GCTTCAAAGG CCCAAGGCTG GCATTGCCTG 180

GGGCTGGAAT ACCCTTCTGG AGCCATCATQ GAGRCtSCCAT CCTGGGCCTG GAGGAAGTGC 240

GCaCTQACBCC ATCX».TaAC3a AAcaaoAGrG gcgcogtgtg GAGCAGGGQC TCTGTCC90CT 300

TCTCTG0CT6 GGMGTAdSAG GTGCAAATGA GQGTGAX3Q6G ACTGGGGGGC GGGSGA60CC 360

ADOGCATaaC CaTGTGGTAC ACCCtSQGGCA GGGGCCATGT AGGCi'd'GTC CTTOGOBOGC 420

TGGGTTCGTG GGACGGCATC GGGATCTTCT TTGACTCTCC GGCRGAGGAT ACTCAGGACA 480

GTCCTGOCAT CCGTGTGCTG GCC3U30GACG GGCACATCCC CTCTGAGCRG CCTGGGGATG 540

GAGCTAGCCA AG66CTGGGC TGCTOTCATT GGQACITCOS GAACCSSCCA CACOCCTTCA 600

GAfiCl^CtSGAT CftCCfACl^GG OGSCftGAOGC TGGGCATGTC CTTGAACnGT GGGCtCftCIC 660

CCAGTGATOC AOTTGAGTTC TGTGTGOATG TGGGGGOCCT GCTTTTGCfTC CCTGGAGGTT 720

TCTTTGGGGir CTCAGC&GCC ACOGGCACCC TGGCAOATQA TCATGATGTC CTGTCCTTCC 780

TQACCTTCAO CCTQRQTQAG CCCAGCCCAG AGGTTCCXX:C TCAOCCCTTC CTGQAGATGC 840

AGCAGCTCGG CCTGGOGAGS CAGCTGGAAG GQCTQTOGGC AAGGCTGGGC TTGGGCACCA 900

aGaaGaATGT aacigcaaaa irCAGACrCTG AAGCICAAGG AigAAGOGSAA AQBCTCrT'TG 960

ACCTGGaG(% GftOGCTGGGC 2U3ACACXGCC GQATCCTGCA GGCTCTGGG6 GGTCTCTCCA 1020

AGCAOCTGGC CCAGG^TG&G AGACAATGGA AGAAGCAGCT GGGGCCCCCA GOCCAAGCCA 1080

GGCCTGACOG AGGCTGGSCC CrCGATGCTT CCTGCCAQAT 'TCCATCCACG CCAGGGAGGG 1140

GTGQCCACCT CTQCATGTCA CTCAATAAGG ACTCT6CCAA GGTOQC3TGCC CTOCTCI2R1X3 1200

GACAGTOGAC TCTGCTG(»G GCOCT6C3UU3 AQATGAGGGA TGCAGCTGTC CGCATGGCTG 1260

cluaAaaccCsi. aorcrcxirrAC CTGCCTGieG OCATTGAGCA TCAITTCTTA GAGCrGGACK! 1320

ACATCXTEGGG CCTCCTGCAG GAGGAJSCTTC GQGGCCCtMC GAASGCAGCA GCCAAGGCCC 1380

CCCQCCCACC TOQCCAGCCC OCRaGQGC3CT CCTOGTGOCT GCAGCxrroac atcttccigt 1440

TCTACCTGCT CATICAGACT GTAG6CTTCT TCGGCrACtST GChCITCAOG CAGGAGCTGA 1500

ACAAQAGCSCr TCAQGAjQTQT cnsicxACAG GcnsGCirac TCTGOrCCI GCACCiACACA 1560

CCCCCAGGGC CCTGGGGATT CTGM96AGGC AGGCTCTCCC XGCCAfiCATG OCTGCCIGAC 1620

CCACCTCAGA GOCTGCFTTG CATCACnXXSG AAGC3U3GCAG TGTCTTGGGT GGaaSCTTGQ 1680

TCAGTATCCr CTCGGTCTGG GTGCCCftGCT CCCACSSCACA CClGAfiCITT C3GGCATGCTC 1740

CCAOCTOGTT AAAGGTGATT TCOCTCTCAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1800

A 1801



Seq ID NO: C201 DNA Sequence
Nucleic Acid Accession #: 3eH_117D36.1
Coding sequence:



1' 
I 

AGCC3UU?AGA 
CGGCCCCCAC 
ATGGCCTGGC 
QGCCCQTGGQ 
GGACGCAGOC 
GATGAGOGGG 
TGTCACCXJCC 
GCAGOCGGGG 
TOGGGATGOa 
CGGAACTGT6 
TTACACAGTO 
CGOCTGCTCA 



11 
I 

6GGGGGACAG 
CCOCACISCC 
OQQOCACOCT 
GGCVGTTCCG 
OOQAAACGCA 
AACOGGGOGG 
ACAGCGGCCG 
AGACAGGGAG 
GCTGAGGiITC 
CACGTTCAOf 
AACAAGAGCA. 
GAGCCGGGAG 



21 
1 

ACAGATGGAA 
AGCfCIGAGA 
CTGGAQACAC 
ACCACACAGG 
GGGTGTCQCC 
GGGOGCCTAC 

CAcaoraOGU 



ttcctgaagg 
gg^tgaacgot 

OGAHGOGTGG 
ATGGOQ^CTC 



31 
1 

AGACGGACAC 
CAT^nGTQCA 
ACATGQC&A6 
TGGCOGAGGO 
CACGTTTGCA 
GCAATGCAOG 
GGCrCTGTXT 
GTOaOOGGG 
ATGAATGCTC 
GTGGGGAG7G 
AGGOCKCTTG 
AQCTOCITTA 



41 

I 

GGGGAGCTCT 
CAGGCCTQCA 

AQGCAGGGCA 
GOCGCGCTOG 
TTACGGCSGCS 
CCIGGAAAOG 
GAAGIGSGGA 
TOmACTGIQ 
AATOCATCTC 
TCTGCTCCTQ 
ACIG 



SI 
I 

GGGGAGCAGG 
GGOGGCXGGS 
GGCAGOOBiAG 
GSCCGCGGTG 
OGTGTCOGTG 
GIGQAAGGGG 
70CAGAAJCAG 
^CGGGAATGAG 



JACAGAGCTG 
CftC3ATGGAf3C 



Seq ID NO: C202 DNA Sequence
Nucleic Acid Accession #: XHJLC7803.2
Coding sequence: 1X^2.. 1468



3. 
I 

AACATCATAC 
AGTCACCCAC 
QTOGGCOCCA 



GCACCACCGQ 
CTTOGCCCTG 
TCCCGAGGCC 
GGTCTCTTCC 



ITGGGCOGAGG 
CGGXCTAGOG 



11 
I 

ATAGrAGGTG 
GTTCCTTOTA 
ACTGCATCCT 
ACCCCaSAGGC 

CCTGCCCOGC 
GGGATCCAAC 
OOCXZCACAGA 
CAACCAGAGC 
GCAOCA06TG 
6C0GG3TAGG 
CAATAGGASA 



31 
I 

AATOGTTTTG 
AATCCAAATQ 
CCTCTTXGGC 
CTCGTGGACG 
CTCCACrCCT 
GGAAGAOCGC 
CAACTTOTAT 
GCTCTTAGOC 
CCACCCAGGC 
CAOGTG6AGC 
CXavSCACOQS 
GCAGAGAATG 



31 
I 

TAGAGTGAAO 
TTTCTATATT 
GGGCTGGGGA 



CTCETQGATG 
CCAGTGAGIX; 
CGAGTGGGGG 
AATCCTATGC 
ACATTAGC6A 
TCIGOGATGC 
AGGTAAGGGT 



41 
I 

AATGCTAATG 
GTAGCTTTGC 
GCGGCCCCCA 
TGTGGTGGGA 
OSCCXCOSTV 
AGCOCXOTCC 
GGGCAACOQC 
AGAGAGCATC 
CCAOGCTTGG 
TATGTTGGGG 
GGTATGGGAT 
AGGGTGGGGG 



60 
120 

lao 

240 

3DD 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

ISOO 

1560 

1620 

1680 

1740 

laoo 
laoi 



51 

I 

CAAAGCAAAT 
7TAAAATGGG 
600GGGACGG 



€0 

120 

lOO 

240 

300 

360 

420 

480 

540 

600 

660 

704 



XCCfGTOCCC 
TTCCAGGOGC 
TOCCCATTTT 
TCCTQQCAGG 
GCTTCOCCAG 
CGGCAAGCGG 
GQ6GOGGGGG 
GASGGGACGG 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 






GAGCXTTGOC 
CCCCATCTGT 
CAOCATCACC 

AGCXAGCTGA 
GGCTTCGTCT 
TTTCTTGTGA 
GCACCTGCCC 
CCAjQGGGQQC 
GAGTGCAAGT 

ACAAAATTO 
TATTTTT(?TG 
CGT6AAAAAG 



ACCATCCCAG 
AAAATGATGG 
TGACCTGbOS 
GCBAATCCTG 
06CCAACGCA 
GCCA.TCTTTG 
CCATCTCCCC 
TCTGGCACCA 

TcxxxaoAjcac 

GTTTTATAAA 
GAAATGAAOA. 
TCCAOCACaC 
GGAATGOOCX; 
T 



GACTTTaOGC 
TAATAATACT 

gtcaa<3G<::ag 
actgcaaggc 

GGCCCATCCG 
TGG6TGCCCT 
ATGAAAGTGC 
GATGOCAGGO 
COCA0GGGIG 
AAATAAAACA 
AAAACTGAAG 
ATACXXTCTA 



AAGTCACCXX3 
TCACXlTACCr 
QAG6ACTCOG 
CCACCTGCCC 
TCICTTCRCT 
AGACTTAGTC 
TGTACAJVATT 
AAGGGACAGA 
GGGATTGSTB 
AAAACCCAjCC 
AAAAAAAAAA 
AGCX3U3CAAG 
TGGCAAG60C 



CaCTCCCTtsG 
CATAGGGGAG 
AAGGT6CTAC 
CTCCCCCACA 
CT6TOGCAGG 

CCACCCGCCC 
GGAAAACAGC 
□OCACTGTTT 
ATCAC2UVAAA 
AACAOGAAAA 
ATTTCXITCTT 
T6TTCTGATT 



GCCTCGGTTT 
GTTGTGAQGC 
CCGTO&QCAA 
GAGCCTCCAG 
CCCTTTCATG 
CCTGGTTTCX: 
CAGGACCCOC 
CACAAACAAG 
GTA1GTTCTT 
AAAAAATTTT 
AAAG2VACCAT 
TGCAAAATCA 
AATAAAGGAT 



TBO 
840 
9DD 
96D 
XD2Q 

loao 

XX40 
1200 
1260 
1320 
1380 
X440 
150D 
1511 



Seq ID NO: C203 DNA Sequence

Nucleic Acid Accession #: MM_&247B0.1

Coding sequence: 31..1023



1 
I 

ATTAATCTGG 

GGCATTCTTT 
ACCCTC3VrTG 
AATTCCTTOC 
GGCCTTQUCG 
GTQTQGATTa 
ATCAXGTTCr 
TOGCajOGCCT 
GGOGrCTTGT 
CX2TTTTCQAQ 
ACACQGCCTG 
TTCTTTTTCA 
GAGGGAAGGA 
AAAATGTXCC 
AGCTCACTTG 
CATGATGGCA 
TGATGACTCT 

CAG6AGC6GA 
TTTAATACCT 
GTTGTAATAT 
ATTATCTTCA 
TCCXSAGAOCT 
CATGTAATAi: 
ACTTTGAACT 

GOOCIGTGAG 
AGTACOQTAa 
ATAOCCSCACC 



11 
I 

CCGTGCXATG 
AAQTCTACGT 
GTTACTATTG 
GCi:y\i5GAiCAT 
TGGGGGRGTT 
AGTTTG3UC!AT 
GCATCTTCTT 
ACrCCAAAAA 
CACAGATGAT 
GCACCCTGGC 
OTCTGCCTCT 
GCTACCTGTG 
TCJCTCACCCT 
AGATTATGAT 
TGATAOAAAA 
TTCTGGAAAG 
GT C TTC37tL.-r"i- 
TTTGGTAACC 
TQCCACCTGT 
AACIGACTAC 
QGATTGCXOA 
TGAAA.TGAGC 

GGCAAAGGTT 
TOCAGCAGGG 
AAAGACArrCC 
TTGOGTACTO 
GTGTTATACA 
OCAATGAOAT 
GIGATGftlCTT 
AAS 



21 
I 

CATCTACTCC 
TCTCCTGATC 
GCrCAACACC 
CTACCX3GCTC 
TCTGAGOAGA 
TGCCAGGAAC 
CLQCCCCCXG 
TATCAGGCTG 
QACTTTCTTC 
CATCACCATC 
CTTCATTCAC 
GGTTGTTTGG 
CATTQTQCTA 
AAGGCTGCTC 
ATTGATCAAG 
GAGAGAOGTG 
GCGATCXAGA 
AGACAOCAM? 
GCCTTTAGGA 
CATGTAATTA 
ITTTTCAAGA 
CIACAAAAAC 
G<3UX3AATGA 
TAGAAACIGT 
ATATGAAGCC 
CGCATGrOCC 
ATTACATGAG 
TGAOCATCAA 

TTA:rcctaiu3A 

GAAATATAAT 



31 
1 

ATOPTCAQGC 
CGAAACATCT 
QTQC3COCTGT 
CTTCTGATGO 
ATCATTGGGA 
GTTCTAGAAC 

ATGATBAATT 
ATCTTCTTGC 
TGGAOATTGA 
TCCATCTACA 
ATCTATOQQA 
ATCATCftCCT 
CATGAOCAGA 
CTGCAGGRTA 
GAQCAACAAG 
AGATCAGTTC 
CAAAX2WGG6 
ACTGCCCAGA 
TCAAAGTAAA 
CAAAATACTT 
CTAGGAAflAO 

TGCTAAGAAA 
TCTGAATTGT 
AAAOGTGGAA 
TCTTTGGAGT 
AGTOCXACBT 
CGCXaATTATT 
CAGOGCTGGC 



41 

1 

TTGTOGAGAG 
TTTTGAAAAT 
CTGGTGAAGA 
ATTTrGTOTT 
TGCAACTGAT 
TQATCTATGC 
TCCAAATOAT 
TOCShSCCTGC 



AGCCTTCAGC 
GCTGGATOQA 
ACCTCATTGG 
ATCTTTACTG 
TCAXXAATGA 
TGGAGAAGAA 
GCITTTTGCA 
AA0AAGGTAA 
CauSGAGAQGA 
AGAAAATCCA 
ATTGGGCATT 
GGGGTTTTCC 
ATAACTAfiOG 
GAAATCTCTC 
AGTGQTCCAT 
AGAACCTGCA 
TACSIACCAQA 
OGGGGATGGA 



51

i 

GTAOGAGATG 
ATCAATOlTT 
GTOTTGG6AA 

ctcittagtc 
cacaw3tctt 
acaaactctg 

TATGCTTOTC 
GftSCAAAGOC 
ATCCTTCACX: 
TGACTGTGGC 
CACCCTAAGT 
AAGTGT6CAC 
GCAGASCSMCA 
GGGCAAAK3AT 
AGC31AACCCC 
TTTGGGGGAA 
TOCAAGGGCC 
AAATGQAATG 
AGQCTTTAQC 
CCATGCIATT 
AATAAAQATT 
AATAATGTAT 



GCTAATTOGA 
AATTTTTGAC 



CCTGAATAAA 
TTTATTTGTG 
GGTCTCATCT 
G6AIGGTTCTG 
TTGCftGTQGG 
AATTTTOCCA 
AGTCCCTAGG 



Seq ID NO: C204 Protein Sequence
Protein Accession #: Eos sequence



1 
1 

AGLXJUUG'X'l'A 
ATCCIGGCAT 
CXCTGGQAGT 
CAGTGGA6RA 
TCATCATACA 
TTTTTCAAGA 
TTGATGTGAT 
TGOaOTTTTC 
TGGGCATAGA 
CC3CITATAGC 
TGOAAGAGAA 
GTATCCTTCA 
GATCCSU^GAA 
AGATCATTGG 
TETTGAAGTT 
TTAOOGTGGC 
GTTATlTTAG 
GGAACAGOGG 
arGACCACCTG 
TTAATGCCCA 
TCACTCAIGA 
AOCTGTCAGA 



11
I 



COGTrTTA'l^A 
OGGIGCTGTT 

cxATGrrcAs 



TTIAAAAATA 
GCXIAATCAAT 
GAAGGCTTGC 
OGTTGAGAAT 
AAAGSTGCTG 
AlSACl'CrCAG 
AACATCTTTG 
CACTCCXGGT 
CAT6ACATCC 
AAOQAAQCTI^ 
GGTCARITGC 
CAGCCTGTOG 
AGGCARGTTT 
CAACATTTTC 
CAAAAAGAAC 
OGACACAGTG 
GGCATCCTAC 
CTTCTTCAGT 
CAlTTACXCX! 
AAAAGCVGTB 
GCTCGGTCAG 
CAIQSTAGCC 
CCTGGCn3AG 
ACIQOCTTTC 
GCTTGTGGAO 



21
I 

GrTATQTTGT 
ATGT6AATGG 
GGQGGTGTGA 
AGGA GGCAGG 
GTTTCCTCCG 
CTAGACICAA 
ACAGTCAGCA 
AATGAGTOGA 
TCTTCACATG 
AGAQACAOAA 



31 
I 

GATTTGTGIA, 
TGTATOCCTT 
CTGCrGTATG 
AOATGCTGTC 
GGGIGCAAAG 
GCATCAACAT 
AAAGGAATGA 
TGTClCaGAC 
AAACIGTTCA 
TTAAAQCCAT 



41 
I 



TGTATTCAGT 
GAAATTCTGA 
GGAACCAGCG 
TCATTCATCC 
ACOCTCCAGT 
ATGTACTATG 
AACATGCAGC 
TTGCTGTTCA 
OGAOGGATCA 
AAGCCAAAAC 
GAGAATTGCA 
TQGGTTGTCT 
TACAACTTAG 



GTCI6AACTC 

ATTCCATCAG 
TCCTCTOCTA 
TGAACCTCAG 



GcrrrxACAC 

TGGCCTACAT 
GCATGGCCAA 
CX3UU3CTQAT 
AGAAGAATCT 
AOTIGACaTT 
CTACAGGAGT 
AOTTCCTGAA 
GCATTAATCT 
T6CCAGGGCA 



TOTBATCATC 
AAGTCTC7VGG 
OOATGACCAC 
OCACOCATCC 
GGTTTTGTCA 
CCAAABGGST 
GCTTCATASC 
AGGACAQAAG 
CStiSGAAOCAG 
AAAAAOCAAA 
<31TTTCCX3GG 
CCTGTGGCAG 
TTTCAACTTT 
CTTCATCAXA 
GGAjSTTTTTC 
CAATTCCACC 
CTTCACAATC 
GTATTTCCC3G 
CTTTTGCTG6 
lAGCACTGAG 
CAATCAGCTG 
GGCCATAGCC 
GACACACAGT 
GGCCGIGCCA 
OGAAGTCTAC 



51 
I 

TGGTGAGCTC 
ATCACITITT 
AAG<XX»CCC 
GTGAATGAAA 
3CAAATCAQA 
KIAXCIGACA 
AACCAGGIGC 
CTAGAATGCA 
TTAATCC3CAT 
CCAAGGAGCA 



GCTXATGQGA 
AAGAOGCTGA 
CTGAGAIGGC 
ATCCCTCAGT 
ACXG0QSTG6 
ATCGAGCAGG 
GGAGCa^TGCT 
AACAACTTCA 
GACTTCACTG 
ATAAGGGftOA 
CTGACCOGCT 
TGCTGTGCAG 
AACCCTGGGG 
TGCATCTACT 
GTTCrCCTGA 



60 

120 

180 

240 

300 

360 

420 

480 

540

600

660

720

780

840

900

960 

1020 

1080 

1140 

1200 

1260

1320 



1440 
1500 
1560 
1620 
1680 
1740 
1800
1813 



60 

120 

180 

240 

300 

360 

420 

480 

540

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 






TCCGAAACAT CTTTTTGAAA ATATCAATCft. TOJOSCATTCT TTGTTACTAT TGGCTCAAO^ 1620 

CCSTGGCCCr GTCTGgTOAA GftGTGTTGGG AA3U3C3CTCRT TOQCCAGOAC ATCTACCGGC 1690 

TCXTTTCTGAT GGATTTTGTG TTCTCTTTAB TCAATTCCTT CCTGGGOGA6 TTTCTGAGGA 1740 

5GAATCATTGG {3ATC5CAACTG ATCACAAGTC TTGGCCTTCA GOAGTTTGAC ATTCCCAOGA 1800 

ACQTTCTAGA ACTGATCTAT GCACAiACTC TGGTGTGGAT TGGCATCTTC TTCTGOCCCC 1860 

TGCTGCOCTT tATCCAAATQ ATTATQCTTT TCATCATOTT CTACTCXAAA AATATCAGCC 1920 

TGATGAT6AA TTTCCAGCCT OGGA6CAAAG CETGGCOQGC CCCACAGAT6 ATGACTTTCT 1980 

TCaTCTTCTT GCTCTTTTTC CCATCCTTCA CCtSGGGTCTT GTGCACCCTG QCCATCACCa 20 40 

TCrGGAJSATT GAAGCCTTCA QCIGACTGrG GCCCTTTTOQ AGGTCTGCCT CTCTTCATrC 2100

ACTCCATCTA CAQCTGGATC GACACCCTAA GTACACGGCC TGGCTACCTG TGGGTTGTTT 2160

GGATCTATCG GAACCTCATX GGAAGTGTGC ACTTCTTTIT CATCCTCAjCC CICATTGTGC 2220 

TAKTCATCAC CTATCTTTAC TGGCAGATCA CAQAC30GAAG GU^ATTATG ATAAG6CTGC 2280 

TCCATGAOCA GATCATT^AAT QAGGGCAAAG ATAAAATGTT OCTGATAGAA AAATTGATCA 2340 

AGCTGCAGGA TATGGAGftAG AAAGCAAACC CCftGCTCACT TGTTCTGGAA AGGftGAGAGG 2400

TGGAQCAACA AGGCTTTTTG CATTTGGQGG AACATGATQG CAGTCTTSAC TTOCOATCTA 2460

QAA6ATCAGT TCAAGAASGT AATCCAAGGO OCTQATGACT CTTTIOGTAA CCAGACACCR. 2520 

ATCAAAT2U«S GGGRGGRGAY GAftftATGOiVA TGATTTCITC CATGCCRCCT GTaCCTTTAG 25B0 

GAACTGOCCA GAAGAAAATC CAAGGCTTTA GCCfiGaAQCa GRAACTGACT AOCATGTAAT 2640 

TATQWUWSTA AAATTGGGC!A TTCCATGCTA TTTTTAATAC CTGGATTGCT GATTTTTCAA 2700 

GACAAAATAC TTGGQGTTTT CCAATAAAGA TSPGTTGTAAT ATTQAAAWHA ItHHffAMAAAA 2760

ACCTAGGAAO AGATAACTAG OGAATAAmS^I? ATATTATCTT CAAGAAGTGT GTGCAGQAAT 2820 

GATTGGTTCT TAGAAATCTC TCCTOCOVeA CTTCCCSAGAC CIGGCAAAGG TnAG&ARCT 2B60 

GTTGCrAAGA AAAGTGGTOC ATCJCTGAATA AACATGTAAT ACTCCAGCAG GOATATSAAO 2940 

CCTCTGAATT GTAGAACCTG CATTTATTTG TGACTTTGAA CTAAASACAT CCCCCATGTC 3000 

CCAAAGGTGG AATACAAOCR GAGGTCTCAT CTCTGAACTT TCTTGCQTAC TORTTACaTG 3060

AGTCTTTGGA GTC3QGGGATG GAGGAOQTTC TGCSCCCVGTG AGSTGTIATA QKSGMXXXO 3120 

AAAGTCXTTAC GTCAAGCTAG CTTIGCASTG GC7U9TACC30T AGCCAATGAG ATTTATOCX3A 3180 

OAGGCGATTA TTGCTAATTa OAAnTTTTCC CftAlAOCCCA 0CGTQAT6AC TTGAAATATA 3240 

ATCAGCGX^ GCAATTTTTG ACAGTCTCTA CGG&6ACTGA ATAAG 32 8S 






Seq ID NO: C205 DNA Sequence
Nucleic Acid Accession #: NM_002250.
Coding sequence: 397..1690



1 11 21 31 41 51

I I I I I I 

GTGCITGGGT GTCTOSSTGT GSTGAGTAGA GGTGTGTGTC ACAAAGTACA GACCATIGTG 60 

MTGACAAAG COCATCGTGT GTCreTGrGX GTCTTTATOC ACGTGGATGG AOSTCTCTTT 120 

CTTGCrCTGC CCCAftGACAC ACCCTAGOOC CTCCTTATTC TCAAAAGGGG GIUSCTGGGGA 180

GCX:tCXXXX:t ACCCTGGGGC CTCXXCTGCC OCTCOCCGCC CTGCCTOGOC GTCACCACTC 240

COCAGAGGGC ACASGGCTCT GCTGTGOCTC AGnSCAAAAS TOOCAGAQCC AGCAGAGCAG 300 

GCIGAO^ACC IGCAAGCX^C AGTQGCTGCC CTGT6CGTGC TGCGHGCniGG GGGAGCCTGG 360 

GCRGGAAGCT GGCTGAGCCC CAAGACCCCO GGGGCCATGO GtSCSGOGATCT GOlGCrtGGC 420 

CrGGGGGCCT TQRGAOaCCG AAAGC3GCTTG CIGGAGCAGG AGAAGTCTCT GGCC5GGCTGG 480 

GC3«rrGGTGC TGGCftGGAAC TQGCaiTTGGA CTCATQGTGC TGCATQCAGA GATGCTOTOG 540

TrOGGGGGOT GCTOQTQGGC GCTCIAQCZQ TTGCCGGTIA AATOCAOGAX CAQCATTTCC 600 

hOCTTCTIAC TCCTCTGGCr CATOGTGGCC TTECATGCCA AAQAGGTCCA GCTQTTC31TG 660 

ACCGRCAACS QGCTGC3BGGA CTOGOiOaTQ GC3aCTGACX3G GGC3SGCAGGC GG06CAGATC 720 

GTGCTGGAGC TGGTQ6TGTG TGGGCTGCAC CCGGOGCXXM TGaaOGaCC^C GCaSTGCGTG 7B0 

CAX3QATTTAG GGGOGCaZGCT GACCTCSCDCG CAGCCCTGGC CGGGATTOCT GGGCCAAGGG 840

GAAGOGCTGC TGTOCCTGGC CATGCTGdG OQTCTCTACC TaOTGCCOCG dSQQSSSQCiC 900 

CTG0GCAX3CG GC33rC!CrGCT CAACQCT^rCC TACGGGftSCA TCBGOBCTCT CAATCAAGTC MO 

CGCTTCCGOC ACTGGTTOGT GOCCAAGCTT TACATGAACA GQCACCCTGG CGGCCIGCIG 1020 

CTOGGCCTCA OGCMOGCCT GTGGCXGACC ACCGCCrGGG TGCTGTCCGT GQCCGRGAGG 1080

CAGQCWTTA ATGCCACTOG GCACCTTTTCR GACACACTTT OGCTG&rCGC C&TCACATTC 1140

CTGACCATOG GCTATGGIGA 0GIG6TGG0G GGCAGCATOT GGGGCAAOAT CGTCXGCCTO 1200 

TGCACTGGAG TCATGGGTGT CTCCTGCACA GGCcnaCTGS TGSCCGTGGT GGCCCX36AAG 1260 

CTGGaGTTTA ACAA^SCAGA GAAGCAOSTG CACWVCTTCA TGATOGATAT CCAGTATACC 1320 

AAAGAQATGA AaOAOTCDGC TSCCCPflGTO Cl'ACAAGAAG CCTGCATGIT CTACAAACAT 1380 

ACTCOCAGGA AGGAGTCTCA TGCTGCCCX5C AGGCATCAGC GCAAOCXGCT GGCCGCCASC 144 Q 

AAGGGGITCC GCCAGGtGCG GCXGAAACSkC OGGA2^CTCC OGGAACAAST GATUTECCATG 1500 

GTOOACATCT CCAAGATGCA CATGATOCTG TATOAOCTGC AGCAGAATCT GAGCAGCICA 1560 

CAC CX3GG O0C !tGGAGAAACA GATTGACAGG CTGGOGGGGA 7U3CTOSATGC CCTGACTGAG 1S20 

CTGCTTRGCA CTGCXZCTGGG GGG9AGGCM CTTCCAGAAC CCRGCCAGCA GTOCAAGTAG 1680

CTGGftCCCAC GASGAGGAAC CAGQCTACTT TCOCCAGTAC TGftGGTGGTG GACATCGTCT 1740

CTOCCACrCC TQACCCAQCC CTGAACAAAG CAOClCAAGT GCAAGGACCA AASGGOGOOC IBOO 

TGGCTTOGAG TGGGTTGGGT TGCTGATGGC TGCIGGAGGG GAOOCTGGCT AAAG3GGGXA 1660 

GGCCrrCGCC CACCTGAGGC CXX»GC3W3QQ AACAUCGGTCA CCGCC2ACrCT GCATAOCCTC 1920 

ATOVAAAACA CTCTCACTAT GCTGCTATGG AOGACCTCCA GCTCTCAOTT ACAAGTOdG 1980 

GCXSACTGGAG GCAGGACTOC TOGGTCOCTG GGAAAGAQGG TACTAGOGGC COGQATCC2U5 2040

OATrCTGGGA GGCTTCflaTT ACCiGCSrOGaC GAGCTGAAGA ACIGGGTAT6 ASSCIGQGGG 2100 

GGGC3CTGGAG GTGG0600CC CTGGIGGGAC AACAAABAGG ACACSCATTTT TOCAGAGCIG 2160 

CftGAGAGCAC CTGGTGGOGA QGAAiSAAGIG TAACTCACCA GCCTCTGCTC TTATCTTTGT 2220 

AATAAATGIT AAAGCXAG 2238 



Seq ID NO: C206 DNA Sequence

Nucleic Acid Accession #: NM_025257.1

Coding sequence: 1..2139

1 11 21 31 41 51

I I I { i 1 

ATGGGGGGAA AGCAGCSGGGA OGAOGATGAC OAGaCXTTACa GGAflGCCAGT CAAATAOGAC 60 
CCCrCCTTTC GAGGCCCCAT CAAiGAACAGA AGCTGCACAG ATGTOITCTG CIGOGTCCTC 120 
TTCCIGCTCT TCATTCTAOQ T1ACAT0GT6 GXCGOOATTG lOGOCTGOTT OXATGG&GAC IBO 




TCCECTACCC CAGSAACTCT ACTGGOGCCT ACTGTQCSCAT QGQOGAGAAC 240 

AAftGATAAGC COTATCTCCT CTACTTCAAC ATCTTCRGCT GCATCXITOTC CAGCAACATC 3 DO 

ATCTCAGTTG CCTACAGTOC CCCACRCCCC AGGTGTGTGT OTCCTCCTGC 360 

CCGGflfiGACC CAT13GACTQT GGCaAAAAAAC OAGTrCTCAC AOACrTOTTOG GGAAGTCTTC 420 

TATAC3«AAA ACAGGAACTT TTGTCTGOCA GGQCSTACCCT GGAATATGAC OGTGATCACA 480

AGGCTGCAAC AGOAACTCTQ GCGCAGTTtC CrCCTCCOCT CPQCTCCAGC TCIGQGACQC 540 

TtSCTTTCCAT GGAOCftACAT TACTOCACCQ GCGCTCCCAG CGATCACCAA TGACACCACC 600 

ATACAGCAG6 GGATC3VGCGG TCTTATTQAC AGOCTC3UVT6 CCOGAGACAT CAOTGTTAAG 660 

ATCTTTGAAG ATTTTQCCXIA GTCCXGGTAT TGGATTCTTG TTQCCCTGGG GGTGGCTCTG 72 0 

GrCTTOAOCC TACTGTTTAT CTTGCTTCTG CGCCTGGTQG CTGGGCCOCT ©QTGCTGtSKtG 780

CTOATOCTG6 GAOTQCTTGOG GGTGCTOGCA TATI30CATCT ACTACTQCTG OQAG8AGTAC 840 

ClC3AaTGCTGC GGGAOWOGG CX30CTCCATC iTGCCftGCTGa OTTTCACCAC CAAXXTCAGT 900 

GCCTAOCAGA GOGTGCAfiOA QACCTGGCTG GCOOCCCTGA TCGTGTTGGC BGTGCXTGAA 960 

GCCATCCTOC TGCTGGTGCT CATCTTCCTQ CGGCA5CGGA TTCGTATTGC CATCGCCCTC 1020 

CTSAAQGAQG CCftGCftAGQC TGTGGGACRG ATOATGTCTA OCATGTTCrA CXXTACrGGTC 1080

ACCTTT8TCC TOCTCCTCAT CTGCATTCJCC TACIGOQCCA TOACTGCTCT GTATCCTCTQ 1140 

OCXSU3GCAGC CAQCXaCTCT VGGA'CATGTG CTCTOOGCAT CCAACATCAK3 CTCCCCCGGC 1200 

TOTQA6AAAG TGCCAATAAA TACATC!RT6C AACOCCACGG CCCACCTTGT GAACTCCTCQ 1260 

TGCOCAGGGC TGATGT13CGT CTTCCAGGGC TACTCATCCA AAGGCCEAAT CCAACGTTCT 1320 

OTCTTCAATC TQCAftATCTA TGGGGTCCTQ GGGCTCTTCT GOACXCTTAA CTGGGTACTO 1380

GCCCTGGGCC AATOGGTCCT OGCrGGAGCC TTTQCCTCCT TeEACTGGOC CTTGOiCAAB 1440 

GGCCAgonCA TGCCTACCTT CCCCTTAATC TCTGOCTTCA TCCSCACACT COGTTACCRC 1500 

ACTGGGTCAT TGGCATTTGG AGCCCTCATC CTGACCCTTG TGCAGATAQC CCQC3GTCATC 1560 

TTGQAiSTATA TTGAOCACAA GCTCRiORGGA GTGCWJAACC CTGXAC50CCX5 CTGCATCATQ 1620 

TGCIGXTTCA AjQTGCTGCCT CTGGTGTCTG GAAAAATTTA TCAA<3TTCCT AAACOGCAAT 1680

GCSXACMTCft TGAiTOSCCAT CTAQOGGAAG AATTTCTGT6 TGTCAGCCAA AAATaOOETC 1740 

ATGCTACTCA TQOGAAACAT TGrCAGGOTG GTGGTCCT66 ACAAAGTCAC AGACCTGCTG IflOO 

CTSTTCTTTG GGAAGCTGCT GGTGGTOGGA GGCGTOGGGG TCCTGTOCTT CTTTTTTTTC 1B60 

TCCX3OTC3HCA TCCXX3QGGCT GGGTAAAQAC TTTAAGAGCC CCCAGCTCAA CTATTACTGQ 1920 

CTGCCCATCA TGACCTC3CaT CCTGGGGGCC TATGTCATGG GCAGGGGCIT CTTCAGCGTT 1980

TTOQaCATGT GIGIGGACftC GCTCTTCCTC TSCTTCCTQCl AAGACCIGGA G06GAAC3AC 2040 

GGCTCCCIGG ACOGGCCCTA CTACKTGTCC AnSAGCCTTC TAAAGATTCT GGGCAlUaAAG 2100 

AAOC3AGGC33C CCXOSSACAA CAAGAaOAGG AAGAW3TGAC AOCTCJOGGCC CTGATCCAGG 2l€0 

ACXGCACCCC AOCCCCACCO TCCFUSOCRTC CAACCTCACT TOGCCTTACA GQTCTCCMT 2220 

TTOTGQTAAA AAAAGGTTTT AGGCCAGOCG COGTGGCTCA OQOCIGIAAT CCRACACTTT 2280

GZVSAGGCTGA GGOGGGOBGA TCACCTGA6T CSbOGaGllTCG ASACCAGCCT GGCOUUCATG 2340 

GTOAAAC 2347

Seq ID NO: C207 DNA Sequence
Nucleic Acid Accession #: NM_016180.
Coding sequence: 26..16X6



1 

i 

CAGGAAGGTT 

CXAIAAATCC 

CAOCACSACTC 

GSCA6CGTAT 

TGTGTOQTTC 

CGACCACTGC 

GATQCrCGTG 

TRAOCCAAGQ 

TQATTTTGCT 

OCRTCRGGAC 

C3C3X3GGITAC 

TACAOftATTd 

TCATCTGTGC 

GCAAACCCCT 

GAAAGTTAAA 

TCATGCrOAA 

CKIGCCIOCT 

<3XCCAACAVG 

TAGTGCaVCAC 

GGQCTTCTGC 

ATCC^ACATT 

GGGATITATT 

TCSTAATOTCC 

GGAAGAAAAG 

GGGCATGGAC 

TGGC^CIGGGC 

tggo6toqk:a 

CAATAAAQAG 



11 
I 

CCTCTCaiCAG 
CTftGCTQATG 



21 
I 

TGGOCATGGG 



GTGACXJCCRG 
CTCAGCCCCA 
OGGTCCAOQT 
G6CATGGCTC 
AOGAAGCTG3 
GCCGACTTCA 
AAGOAlSAAGG 
CTTTTGGGTG 
CAOGIrCATGT 
AGTATCTCTG 
CAQGACCCTC 
AATGGTTACG 
ClWSVCTCGCA 
CACXACCGCT 
CIGTTCTTCA 
AACTCCACAG 
ATCAACTTCCG 
GGAT^AAAGG 
GG6CTCTTOC 
AOCACCCXGr 
GAGAGGCACC 
TODGCGRCCC 
TOXrrOGTCA 
CIX3ATA£lBCr 
ACAAT6ACCC 



GGA-XOGCCAT 
TCCTGCTCAG 
TCCTGQGATT 
GGGGCC6CCS 



TTTGGGOCAT 
TTGATQGGOC 
GCCTCXSICTA 
CTAHAGACTG 

AAiaOCCGACr 
CATTGTCATC 
TAAA70CAGA 
GGGWVTGAC 
AOCTTTGCRT 
CAI3ATTTCAT 
AGTTTCTCAT 
TQTTTTCCTC 
GTCTTTACTT 
OGAAXCTCTA 
ACACTGZOCIC 
AC3GGGCCAGG 
TCACATOCRT 
ACACAGC0G6 
GXTGCmGT 
XftAAAAAMA 



31 
1 

TAGCAACAGT 
TGACTCTOnxS 
GTTCXSGlAaGA 

CCIGCTGCAG 
GAQACCXZZAC 
TGCOGCUUTT 
AAG1?STCACC 
CATCAAAGOC 

ccatgccctc 

GGOOCATCTG 
TGCATTGGTG 
TACnSAGOTT 
AGATQGAAT6 
GCTGGCAATG 

CAGCCACCTC 
GGGOCAGATT 
CTAQGAAAGA 
ACWTATTCT 
CAGGGGATAT 
CTCCACCCTG 
CTTTAACGTC 
AGGGGACOCA 
G6TGCAGCTG 
GAGGGTTGTC 

OQCTcrcmT 



41 

I 

GGGCAGGCTG 
GAGCOGCCTA 
GAOTTCTGCT 
CCCAGCAGCC 
CCCGTGGTOS 
ATGCTCACCC 
OTTGTAGCJAG 
ATGATAOGTG 
TACTCATTTG 
TTCACAGGTT 
GAGCTGGGAA 
CTCACTTTGl? 
GCAAAGSGCA 
TAGOAOTATQ 
CASOGAGCAA 
CIGCTGAGAG 
ATTGGATGGA 
□TGTAOCGOG 
GGAGTOOAGG 
TACTTTCAGA 
TrOCTGTTTG 
GXGCTGTGCA 
ATTACnaACnr 

GACAACAGOG 
GCrC3«3ATCC 
GTOGTGGTGA 
GTTAGATATQ 



51 
I 

GCCGCCACA^r 
AAAGACCCAC 
ACGCJGGTQGA 
TGTACAOaVT 
GATGOGOCAG 
TGGQAGTCAT 
CTTTGATTGC 
TCQlTCrCTT 
ATGTCTGCTC 
TTGGAOGTGC 
GACTGTTOGG 
GTTTTACTGT 
TTOCCCCACA 
GTTCIATOGA 
AAAACSIAAAA 
CACTGGTGAA 

ooaccrrrcxrr 

GGGATCOCTA 
TTGQATGTTG 
AASTTTTGGT 
GOCXGGGGAC 
GCCTGTTTGQ 
ACCACGGCGA 
TGAGAGGGAA 
TGCTOGGAGG 
TCACASCGTC 
TOQATXAGGT 



Seq ID NO: C208 DNA Sequence
Nucleic Acid Accession #: NM_003273.1
Coding sequence: 25S..2024

1 11 21 31 41 51 

I 11 I I t 

O6CO6O0GGG CCGQATCCTC CGCGCX3GGCXJ AGTCCATCXC CTGOGAAATQ GoaCOaACAa 
TOTTTC3CTTG ACTGACTATT GTGRGC3GtXX! TCTCTCTCXIG GCGGAGOGGA GACCATOGCC 
CXXACTCAGG CCCCCOQGCC OOGCTGWUVT TGGGAlGaGCC CCrOGOTAAT OGGGCAiQAaA 
QKTGGGAOCr GGGGCAAAGG CXAAlGGGAAG QAGAGCTGGA GOOGGTGAAC TAAGAGCOG6 
GGOGAGATCT QAGGATGGAA OSCTITGQGG 6TGTGGGAGQ Ca^GAGGGACC OSGGGGTTTG 




60
660
840
960

CAGOGAAOGG
CJGGQRTTTAT 

CTGAGGACGG 
GGGOGCCOTG 
CTTCCCGCAG 
CTGGCQGCCC 
GAiSOTQCTOT 
GCGCTCTACC 
GAAGCOAATG 
COGCAGTGGC 
GCTTCCAGGC 
CTCTGQGGGC 
TCATCTTCAQ 
CTGOGGGGAA 
QTATCTQTTT 
TCCTChTCAA 
CCaTQTGQCT 
AGGCCGTCCT 
GGGACATC3GC 
OeCftSGQCCT 

GAGTGGCTGG 
QQTaaGOTAT 
CCTTGCOCTG 
TOCTQGTGCA 
GCAGGAGTAC 
CACCACCOCA 
0@ACACAC?TT 
AGAAGAG6TG 



TGTCTGQAGA 
GGTGTCGACT 
GACGCCTOaa 
GOTQCCSGAGG 
CGOGGCGGGQ 

GTTOSQGCCC 
GGAGOCCAOS 
TACTQC03GC 
GGCTCGGC6A 
CGAGGGGCAG 
CCrrGGrlGCTG 
GCTCXXX3C3AA 

cETTcmcrc 

CTCAGGCAAT 
CTTCGACTTC 
OCTGGCCCTG 
GGTCftATGGC 
CACCACCATG 
CTGGGTGCCC 

TG€GGGGAa.T 
GCTTGftCAOC 
GGTCCXSCCAT 

CCGTGAGGCC 

GGOXSGGGCAT 
GOGACTCAAG 
GTTT2V3AGCA. 



GGGAGAOCTG 
GGGAGCAGGA 
AAGGCSAACrO 
CGCACTCTGG 
GGOAGCTGGG 
TCTGCTACTG 
CGOQCGCCTG 
GQCGCTGCTG 
GGaCAAOGlG 
GOGAAAGGAC 
QAATTGAftOG 
ACAGCOCTGT 
ATGCTCCTGC 
TACATGAAGG 
GCOATTTACX; 
AAATATTTCT 
TTGATQAAOS 
TTCCAGTTGC 
GATATCAlSiC 
TTCACCTACA 
ATGGCCICTG 
TCCCAGAAAA 
ATCTCTACAG 
CCCAACTATC 
CACCTGCTGC 
OGGGATQAI9C 
TCCCTTAC50G 

gtgcccactc 

GGCTTGCACC 
AOGAAAAAAA 



AGGAGGGGCC 
GGAGGGTCTT 
GGAGGCAGCG 
GAATGOCGA6 
GG6CTAGGGG 
CTGCTGCCCG 
CTGGGTCCAC 
CTGTGGCTOG 
OGGOOCCTGC 
GCCCC3GGGCC 
ACAAGAGTCG 
TGGTGGGGCT 
OCTTGGOGTT 
OGCAGGIAGC 
ACTTTTTTCT 
GTGAACTGGG 
AGGCAGAGCT 
TCTACtSTGGG 
ATGAOGCjCfiT 
GCdGCAGGC 
TCATCTGCCT 
ACACTTTCCG 
CCACAGGGOS 
TTGGAGACCr 
CCTACTTCTA 
GGA£SIQCCTG 
CATCATC5CCC 
ATOCACCAGC 
CCACCXSVSOC 
TGAAACCAGT 



GGTTCTGG03 
CGAGOGGCCT 
OOGTGCCTGG 
AGGGTCCCGC 
CGGA0G0CX3A 
CCACCATGIT 
CCGOSTCCCT 
CCTGGCTC3QG 
T0G0GGAC3GC 
TTATCAGAOC 
CCXGCGCTAT 
OGGGATGTCA 
TGTOSCCACC 
CCCAGTTTOG 
GGQACGAGAG 
ACOOGGCCTC 
TCGAGGCAGT 
TGATGCOCTC 
TQGCTTCATG 
CCAGTTCCTG 
CATCAATGCT 
AAAGAATCCT 
GAAACTGCIG 
CATCATQQCr 
CCTC3CrCTAC 
CAGAAGTACG 
TACATCTACT 
ACAOXSUSGA 
CTGAGGAT6A 
OACCAAAAAA 



GCIGCAGAAC 
GGGGGCGGGG 
GGGCGGAGGG 
AlOAQACGTCA 
GGIGATGGCC 
CCACXrraCTC 
GCCCX3GGCTG 
CCTGCAGGCG 
TOGaOGBAGG 
CCCCTTOGAC 
OCTATTAAOG 
GCGGGGCTGC 
CTCACOGCTT 
aCCX^TGGCAC 
CTCAACCCTC 
ATCOGCTGGG 
OOCTCACTOa 
TGOCAQGAGG 
CTGGC3GTITQ 
CT0CACCACC 
ACTGGTTACT 
TCTOACCCCA 
GTGTCTGGGT 

TTCACOQCGC 
GCCTGGCCTG 
OAAGCGGCTC 
CCAGGAGOCT 
ACAACCTCAG 
AAAAAAAAAA 



360 

420 

480

540

600

660

720

780

840

900

960

1020

1080

1140

1200

1260

1320

1380

1440

1500

1560

1620

1680

1740

1800

1860 

1920 

1980 

2040 

2100 



Seq ID NO: C209 DNA Sequence

Nucleic Acid Accession #: NM_015720.1

Coding sequence: 21..1838



1 11 21 31 41 51 

I I i 1 i I 

CCAGTTCCGG CA0GA6GACC JUPGGQCCGGC TGCTGOGGGC DGCCCGGCTG CC3GCCX3CTGC 60 

TTTCGCCQCT OCTGCTTCTG CTGGTTGQOG Gl«3C3flTTCCT GGGTGOCTGT GTGGCTQQQT 120 

CTGATGAGCC TQGOCCAGAG GQCCTCACCT CC3«:CT0CX:T GCTAOACCrrC CTQCTGCCCR 160 

CTGGCTTGGA GCCACTGGAC TCAGAGGAGC CTAGTGAGAC CATGGGOCTG GGaGCTQGGC 240 

TGGGAGCCOC TGGCTCAGGC TTOSCCafiOG AAGTiGAATGA AGAQTCTOGG ATTCIGCAGC 300 

CACX7U:A£7rA errCTGGGAA GAGGA6GAAC AOCTGAATGA CTCAAGTCTG OACCTGQGAC 360 

CCACTGCAGA TTATGTTTTT OCTGACTTAA CTGAGAAGGC AGGTTCCATr GAAGACACTA 420 

GCCAGGCTCA ACSAGCTGOCA AAOCTCOCCT CIOCC7TGCC CAAGA3XSAAT CTGGTTGACC 480 

CTCCCTGGCA TATGOCTCCC AGAGAOOAGG AAGAAGAGGA AGAGGAAQAG GAGGAGAGGG 540 

AGAAGGAAGA GOXAOAGAAA C3kAGA6GA0G AOGAAGAGBA GOAGCTGCTC CCIGTOAATG 600 

GA3<CCCAAGA AGA3U3CCAAG CXTTCAGGTCC GTQACTTTTC TCTCAOCAIOC AfSCAGGCAGA 660 

CXirCCAGGGGC CACCAAAAGC AGGCATGAAG ACTCOGGOOA CCAGGGC3CA TCAGGTGTGG 720 

AGGTGGAfiAG CAGCATGGGG CC3CAGCTTGC TGCTGCCTTC AGXCMDCCA ACIRCRGTGA 780 

CTCCOaOGQA CCAJSGACTCC ACCAGCCAAG AOQCAGAGGC CACAGTGCIG CCAGCTGC5U3 640 

OGCITGGGGr A6AGTTOGAG OCTCClTClvaG AAGCAAGOGA GGAAQCCACT GCAiGGAGCAG 90O 

CTGGTTTGTC 1>3GCC3VSCAC GAGGAGGTGC OGGdCTSGCC TTCATTCOCT CAAACCACAG 960 

CICGCAGTGO GGCCGAGCAC CCftGAIGAAG ATCXZCCITOG CTCTAlSAAdCC TCAGGCICTT 1020 

OCCCACTiGGC CCCTGGAGAC ATGGAACTGA CACCTTCCTC TGCTACCTTG GOACAAiaAAS 1080 

ATCTCAACCA QCAGCTCCTA GAAGGGCAGG CA0CTGAAGC TCAATCCAGG ATACCCTGGG 1140 

ATTCTACGCA GGTGADCIGC AAOGACIGGA GCAAITCXGGC TOGGAAAAAC TACATCATTC 1200 

TGAACATGAC AGABAACATA GAClGlXSASG T6TTC0GGCA GCACCGGGGG CCAC3U3CTOC 12^0 

TGGGCCtGGT GGAAGAGGTG CTGOCCOQOC ATGGCAGTGG OCACCATGGG GCCTGRKSUCA 1320 

TCTCTCTGAQ CAAGCCCAGC GAGAAGGAGC AGCACCTTCT CATGACACTG GTGOGCGAaC 1380 

AQGGGGTGGT G0C3CACTCAA QATOTOCTTr CCKTGCIQC3G TGACATCCQC AGGAGCCTGG 1440 

A£36AQATTGG CATOCAGAAC TATTCCACAA CX3U3dAGCT6 CCAGGOGOGG GCCAGCCAGG 1500 

TGOGCAGOGA CTAOGGCAOG CTCTTCXSnSG TGCTQSTGQT CXETOKSGGCC ATCTGCATCA 1560 

TCASrCArrGC GCIT G BOCTS CTCTAOIACZ GCTGGCAGOS CX3SGCTGCCC AJUaCIOUUSG 1620 

AOGTGTOGCA CGGOGAGGlUa CTGOGCITCG TGGAGAACGG CTGGCAlGGAC AACCOCACGC l^QO 

TGGACX3TGGG CAGOSACAGC CAGTOBGAGA TGCAGGAGAA GCAOOCCAOC CTGAACQQCG 1740 

QCQGGGCCCT CAAOSQCCCG GSGAGCTGGS GGGGGCTCAT 0GGGGGCAA6 GGGGACCCOQ IBOO 

AGGACCGGGA CGTGTT06A6 GAGQACAOSC AdClGXGAGG GCftGOGAGGC fSCAGGCCGAG 186 Q 

TGOGCOGCCA GGACCAAQOQ AQGTGGACOC GGAAACGGAC 0SaC!CG6aSC OCSBCAOCMC 1920 

CCGSCacXTTA COCOQCCSSCC C5CGGOGOCTG GCCCTOaSCSS GGG6CICCTT COCOCTPCKC 1980 

C3CGACTTCAC AGGGOGGCIT CGQACCAACT OCXaCCRCTCC CGCCCQAGGG GCAGGCCTCA 2040 

AAGCCX33C3Cr TGGCCCCGCT TTOCGGOCCX: TGAACCazOG CXXXX3CXKX3C GGOQGGaSCS 2100 

CTTCCTGOGC OCGOGGACTC AATTAAACCC GCGOGGAGAC CACGCGGGCC CAGGGAAAAA 216Q 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2220 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 2269 

Seq ID NO: C210 DNA Sequence

Nucleic Acid Accession #: NM_001197.3

Coding sequence: 61..843

1 11 21 31 41 51 

I I I I 1 I 

OACACGAAGC CTOCOGGGTG GCTTACAOAC GCTGOCA6CA TCGOOGCXXBC CAGAGQAQAA 60 

ATGTCTGA2US tTAAGACXSCCT CICCAGAGAC ATCTTGSA'XGG fUSUCCCTCCT GTATGA6CAG 120 



CTCCTGCSAAC
CCTATGGRGG 

TCCC3AGGTGG 
ACaoOATGTTC 
TTCTGGAGAT 
CraCTGCTGC 
TGAOGCCCCG 
GGCQGCCTGC 
CC0G2VQATAe 
TTCCAGTTIT 
AC1GT6GCCT 
AGGGGGA6TG 
GAATAGA.TTC 
AAA 



CCCCGACCAT 
ACTTCGATTC 
TCGOGGACGA 
CCATGCACAG 
TTAGAAOTTT 
CCCCGAACCC 
TGGCGCTQCT 
GCGGCTCAGG 
TGCTGTTATC 
TGCTGOAAjCA 
06TTTTTTCT 
GIGCCCAGGSA 
CTGGTCACAC 
CQAGGAGCAG 



GGAGGTTCTT 
TTTGOAAIGC 
GATGGACXSra 
CCTQGGICrG 
CATGGAOSGT 
COGGTCCTGG 
GCTGCOGCTG 
GCGGOaCTOa 
TTTTTAACTG 
CTGCTGAGGT 
AAAAGAtlSAA 
ASAGCCATTC 
CCCTQIGIGA 
GAlsTGCTCAA 



GGCATGACTG 
ATGGASGG<3^ 
AGCCTCAGGG 
GCTTTCATCT 
TTC3UXACAC 
GTGTCCTGCO 
CTCAQGGGGG 

ccccACCCCx: 

TTTTCTCATG 
TTTATACTCA 
TTCCTATGGC 
ACTCCTGCCC 
TATGTGATGC 
TAAAAT(3TT6 



ACrCT<3AAaA 
GTGAOGCMT 
CCCOGCGCCT 
AGC3ACXAGAC 
TTAAdOIUaAA 
AACRGGTGCT 
GCCTGCACK3* 
ATGACCACTG 
ATGCCTTTTT 
GGTTTTTTGT 
TCDGCAATTG 
CTGCCCACAC 
CCTCGGC3VAA 
GITTGCftGCA 



GCSACCTOGAC 
GGCCCTGCX3G 
GGCCCAGCXC 
TGAGQACKTC 
CATAATQAGO 
GCTGGCGCTG 
aCTQCTCRAa 
CCCIGGAOGT 
ATATTTAAAC 
TTTTTTTTTA 
TCAOCGGTTA 
QQCTUaQTAfSC 
GAATCTACT6 
AAAAAAAAAA 



180
240
300
360
420
480
540
600
660
720
780
840
900
960
963



Seq ID NO: C211 DNA Sequence
Nucleic Acid Accession #: AF272357
Coding sequence: 83..iOfiD



1 
I 

GCTGCTCOCG 
GCCCOQGC36G 
G0GGC7GCT8 
CGCCGGCCAC 
GGCAAGOTQT 
CCAGCAAGGG 
caoactoqaa 
ATCAAjCTCOG 
ctcgqcacqs 
CAGSGCGCftC 
GGABCXKOHQ 
OGCOSGTGCA 
CCGCCTOACT 
OGGGATCTOG 



GGCCTOCTOG 
GGCCCCOACC 
GCOCCTGCOG 
CAOCTQCTCC 
AACATGTTTT 
TGCCAGG^G 
GTTCG6TTTA 
GGGTAATTCT 



I 

AOGOGGAGOC 
CGGCGCTGAA 
CGGCTGCTQC 
CCGGATGTAG 
CCTCCTQGTG 
CTCTGTGTGC 
OIVTOAGATTG 
OOCCTACCCA 
eSGCAGGGGC 
ACCTCCCTGG 
aOAGGGOVAG 
GCCGCCCrCT 
CAGAAGGCOG 
CCTGGGGACC 
CnGATGCTGT 
GATGAOGAGA 
GGGGAAATGG 
600C0CTGCT 
CCGACXrrCGA 
GATGCTGTOT 
ACCCCOGAAC 
GACCCCCAAA 
AGAQAC3U^AA 



21 
I 

CGQAGCCOGC 
GGATGGCGAC 
TCTCCGGCCT 
OCJGOCTOTCC 
CACATGCCTG 
CCAGOATQCQ 
ACTTCCTQGC 
AQGACCGTVCA 
TGGAGCTGGG 
<SCTCCCGr6T 
GCGAC30GCCT 
CCGTAQCCTC 
ACTACGCCAC 
AOC3t3GCrGGC 
OOCTGGAGOG 
ATQAQC^CGG 
AG6TGGGCAA 
CACX^GOCTGC 
GGCCCCOGGG 
GCTTTTQGCr 
CTTTGTGCCA 
CTGOAQGaOQ 
GC5CAATEAAA 



31 
1 

GCCGAGOCCC 
QCCQCTQCCT 
OGTCCTCeOC 
OGQGAGCCTG 
TGGGDOCTaC 
CCX3GCCTCCA 
CCAGGAGCTT 
G0S6CTCOCG 



GTCATOOGAC 
CGCCCTTQTQ 
CCrCTGCTGG 
TGCX3AAGQCC 
ACAGAGGGCG 
GCA.TAAAQAG 
AGACTTCAOG 
CCCTCTGTTC 
ACTGOCATGA 
GAGGGGCAQQ 
GGGGCTCGGG 
GGACACCTCC 
CATGGAGAAC 
GTCCATTTCA 



41 
I 

TG<3CCTCGCG 
COGOOCTCCK 
GCCGCCCXGC 
GACTGTGCCC 
CTTC3«30CX:T 
GGOGOGGGCC 
GCCCGGAAGG 
CgAGCCTGCCA 
AGTCCAGOAA 
CCTtSTGCACA 
CrGATGCTGG 
TCJCAQOCTGC 
GCTQGCTCAC 
GAOATarACC 

cx:acccaagg 
gtotacgagt 
gaccaogccg 
cctg6aqqca 

GCCTGGAGCrr 
CTCCAOGCCC 
TSGTCCCCTG 
CGTAGAGCGC 
GAAAAAAAAA 



51 
1 

GTGOCATGCT 
aSGGGCACCT 
GTGGA6C0GC 
TQAAGAGGGG 
TOCAGGAGGA 
OGGCCCA6CC 
AGTCTGSACA 
OCCTGGGCTT 
CCCCCAa3C5C 
TGTCGCCCCT 
CGTTCTGTOT 
AGOGTGAGAT 
CTGCAQCTCC 
ACTACCftGCA 
AGCTGGACAC 
GGCCGGQCXIT 
CACTGTCCOC 
GACAGAC3G0C 
TCCCACTAAA 
TGGGACCOCT 
CACCTCTCCT 
AGGAAOSSfiT 
A 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780

840

900

960 

1020 

1080 

1140 

1200 

1260 

1320 

1371 



Seq ID NO: C212 DNA Sequence

Nucleic Acid Accession #: NM_004445.1

Coding sequence: 799..3fll9

1 11 21 31 41 

I I 1 1 I 

OGGIVGGGGGC GGGGCX3GGCT GG6TT06CTC CAGCC3GOQGC TCTACAGCAG 
GACOOGGGAC CCAQCTTOGC GACGGOQATI CIOGAOSCGG GCCCCC3U3GA 
GCCOCa«KTC TGGAGCAGOC OCTGOOGCCA G03TCROSTC CACCOOSGAA 
TCTC3GOCQOC GAACOQAiCiCC GGGCOGC?XGC AAGGGGGTCX: OCGOACTGGA 
GTGGCACGGT GCGAGCTCCA GGAGCCCOGG GTCCACTGGG AGGCCTCGGG 
CTGCAGAGAC TGCI3GCC3UVC GGGAAGAAAX AAAGGGATTA TAOTCCACJOC 
CITCTGAGAC TCAGACAGGA GGAGAGATAQ AGAA0GGCX31 ATCTCIAGAT 
AOGAGCSTaCC AASCCTGTTr GTClTCATTG TGACACTGGA OTCTAlQATOC 
AAGATCAGG6 TGCCGGCATG GTCAGTTCCT 060SAAGCCT CTCTTCTAOa 
CCCTCTTCTT TGTTGTGTCC TCGAATGGCA GAAAAAOOaG TGGCTGTTGG 
GAGMSTAAAT GAAOAOAAAG AACTQQAAXA ACOCCTTGCA GAAAAAAAAA 
CITAGCrGTA CACOCTGA6T CTIGC3UUWO CTOCIAQCCICC ACCCAGGKOC 
CTGGGGCOAT GGTGGAaGCC CTOAAjSKlGOr GOCATGGCTA CTaAAGGGGC 
GGGAACRSAG TGGOQOGCAT GGTGTGTAGC CTATGGGTGC TGCTCCTC3GT 
CTGGCTCTGG AAQMOTATT GCTGGACACC ACCGGAGAGA CATCTGAGAT 
ACCTACCCAC CIAGGGGGGTG GGAOGAGOTG AGTOTTCrGG AOSAOCAOOG 
OQGAGCZTTG AGGCATGTCS^. TQTGdCtUSGG GCCGCTCdVO OCACOSGGCA 
TOTGCAGACAC ACTlrTGl^GeA GGGG0G0G66 GOCCAQAGGQ OQGACAITGe 
TCTGTGCQC^ CATGCTCCAG CCXGGGTGTG AGCGGCGGCA CCTGCCQOQA 
CTTTACTACC GTC3U3GCTGA GGAtSOOCGAC AGOCCTOACA QCGTTTCCTC 
AAAG6GTGGA CCAAGGTGGA CACAATTGCA GCA6A0GAGA GCTTTCCCTC 
TCCTCCTOCr CTTCTTOCTC TGCAGCGTGG GCTGTGGQAC CCCftCGGOGC 
QCTQGACTGC AACTGAAC3GT CAAASAGCGQ AfiCTTTGGGC CTCTCACCCA 
TACeiQGCCT TCCAGGACAC GQGGGCCTGC CTGGCCCTGO TCGClXSTCAG 
TACACCTGCC CTGOOGTGCT CXX3ATCCTTT GCTTCCTTTC CAGW3AOQCA 
GCTGGGGGGa CCTCOCTGCr GGCASCTGTG GGCAOCIGTQ 1X3GCTCAT6C 
GAOGATGGAG TAGOOOGOCA GGCAaGaaQC AfiCCOOCOCA 
GGCAAGOXSGA IGGTAGCTGT CGGGGGCTGC OQCT6CCAOC 
QGAGACAAGG CCTGCCAAGC CTGCCCACCG GGGCTCTATA 
CCCIQCTCAC CATGCCCTGC OCGCAGTCAC 6CTOCCAACC 
TGGCIGGAOQ GCTTCTAOCXS GGCC3^TTGC GACOCSbCCAG 
OCAITOQGCTC GCCAGGAGCT TTG6TTTGM GTGGAAGGCr 



CTGGATACCA 
AQTCTTCTQC 
CAGCAGOOCC 
AGGCCCCCTQ 
CAGCRCTCAT 



51 
I 

GGGGCQGCQG 
XTCTOCOGGC 
TCC€»iGGGAC 
GAAGACGG(3G 
OGGCTCAGAC 
AATTC&CAGA 
CAACAAGCAA 
TGGGAAGTOC 
TTTCAGACTG 
A6GAAG0OAG 
AAAAGGGAAG 
AGOGTOanoG 
TGGGCACTTA 
GTCTTCAGTT 
TCGCTQGCTC 
AOGCCTGACr 
GGACAAT1G6 
ACICCACTTC 
GACCZTCACC 
GTGGCftCCTC 
CTCCTCCTCC 
TGQGCnGGGQ 
AGQCGGCrrC 
GCTCTTCTCX? 
GOOCAGTGGG 
AGAGOCAQAG 
CAACQGGGAIS 
AOCAGCACX3A 
TGGGAATGCT 
CGTTTOeOQC 
CACTGGTOCr 
GCTACACTGG 



60 

120 

180

240 

300 

360 

420

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500

1560 

1620 

1680 

1740 

1800 

1860 

1920



CGCCTGCCTC 
TGTOAAGGCC 
GAG6TCCACT 
CTCOGGGCAC 
AGCCCTGAOC 



CAGCCCGACC 

CTQAGCCCra 
CCCTAC6GG6 
CCGOAAAQAC 
GCAGCCATCA 

ACCXACGAGG 
ATCAAGftTTg 
CAdGCCACGGG 
GAAAaCCTOC 
AAOITGCTGC 
TTCKTGGAGC 
CTSCAGCTGG 



JUU3GTGGCCC 
CCAGAGOTCA 
CrC&TGTGGG 
GTACTAAATQ 
TTACATCTAC 
CAOCTGQTQG 
GAGCCA6GGG 
CTGlSaCTCAC 
TCCAACSTTTG 
GGCCTGGC3CA 
C3U3CAACACC 
AQCOCTOQAC 
TOTGAGAGAT 
CGTCTCCaCC 



GGGnCCnXSQQ 

TCX3AOCCTCG 
AOOTACCCTTA 
CDCCTOUSGC 
TGGTGCACCA. 
ACACCAATGG 
CCCACTCCTT 
aCCACATCTA 
GCAAAGTCTA 
TCTCCTTGQr 
COCTGCTGGC 
AGCAATACAG 

accxx:t6tca 

AQQAGQTCAT 
GACGGAOCGA 
AOATGACCTT 
GGCIGGAG6G 
TTQQCQCCCT 
TGGCCAXGCA 
ATCGCTCI3CT 
GXCTTGGCCA 
TTGCA<3.TG6 
AAGTGATGAG 
CRKTHaRllSCfi 
TTATGTTGGA 
CTGCATTTGA 
AAAGGQCTTC 
CQC3U3GCCrS 
GOCTCTQTAC 
TCAiOCCTGGC 
TGnCTGCAlSCA 
ACIGGTCGGA 
GOCCCACACC 
AGCCCCCTCC 



GG6TCGAGGQ 
TGCCZAGCGGT 
CCAGAGAGGC 
CATCTTAdSZU? 
TGCAGGCATC 
GSTQAGCGGG 
GAACATCCTO 
CRCCCTaA£3C 
TGQirrCCAG 
TTTCCAQACA 
C3ATCQGCTCC 
GGTCGTCTTC 
CftSCXiXAGGA 
GGCCATCCGA 



GCftGACTGTCJ 
CCTGGGCCGG 
QGT6GTCAOC 
GGACAGCTTC 
GCGGGGAGTO 
QTCTQCCCnC 
CAGTCCTCAG 
AAAGCATACA 
TTAXGGAGAA 
GGMSTTC06G 

CAAGATGATC 
CCAGGCCCTT 
GCTTTCAGOC 
CITC3U3TGAT 
TGGCCACCAO 
GGGCrCAGTG 
GAAGGGACAT 
AAACCCAACG 
TCATTAAAGQ 



GAOCTGCTCr 
GGTGGGGGCA 
CTGACTGAGA 
GTGCAlGGCXG 
AATGTCRGCA 
GCATCCAACA 
OACTATCAGC 
AGOGAGACCA 
GTGOGQOCCC 
CTTCCTCAAG 

C22U3CGGAAGC 
CTCGGGGTGA 
GAACTTGOCC 
TCTTTTGGAG 
GCXavrCCRGG 
GCCGCAGTGC 
AAQAGCCGAC 
CICAGCX»aC 



AGCGTGCTGG 
QGCCCAAGTT 
ACATCCAGTG 
OOGCSCTTACr 



AAOGACGGTG 
CGCAAGCCAQ 
CTGACCCCTG 
Ai-i'iJGACTGG 
6TGGCTCRGC 
AAQAAGCTGC 
GAGGTCTGAa 
GTGOaACGTG 
CTOCOGATGG 
QAAAGAA0GG 



TCAATGTCGT 
CTTGTCACCG 
GCCSAOTGTT 
TXAATOGG6T 
CCAGCCATQA 
GCATCACGGT 
TCOGCTACTA 
ACACTGOCAC 
GGACTGCTGC 
GG6AGCTGTC 
CTTTQGCCTT 
GGOGTGGGAC 
AGTATTACAT 
GGGAAGTCGA 
AAGTGCGCCA 
CCCTGTQGGC 
TGOBTCAGTT 
CCCTCATGGT 
GGOAGGaCCA 
TGCMSTACCT 
TGAATAGC3CA 
GTTXGCTTCG 
ATGTCTGGAG 
GGGACATGAG 
CTCCAG6CT6 
CCCGGCX3GCC 
ATACCCTGCA 
TOeOCCTOOA 
AGTGCIACCA 
TCAGCCTAGA 
TGC2U:CACAT 
AATGAOQATA 
AGCCGOSCTC 
CTGCATTCOC 
AATTTGGAAA 



GTGCAAGGAG 
CTGCAQGGAT 
AGTGGGGOGA 
GTCTGAGCTC 
ASTGCCCTCT 
6T0CTGGCCG 
TGACXaGGCA 
CGTGACACAG 
OGGCCAGGGC 
TTCCCAGCTT 
CCTCCTGCTG 
TGGCTAGACG 
OGACCCCrOC 
TOCTUCTTAT 
OGGCCGCCTG 
OGGGQGCXICC 
CCAGCACCCC 
GCTQACaaAQ 
6TTCZU0CAGC 
GTOCAGCTTT 
CTTQGTGTGC 
CTGGGCAGCC 
CTTTGGGATA 
TGAGCAG0AG 
TCCTGCTGGA 
TCATTTTQAC 
GGCTGGOGGG 
CTTTOCTTGT 
GGACUCITC 
AGACCXGCCT 
CCAGCTCCTT 
CCCJGTGACTC 
CAACAGCCTC 
TQGTCCrCCG 



Seq ID NO: C213 DNA Sequence

Nucleic Acid Accession #: XM_045340.4

Coding sequence: 195..1067



1 
I 

GGGOSGCGCC 
CTGGOSAGTG 

COGS06GCGG 
AOOACCTTAC 
TOCTCTTCCT 
TCXATGTCOA 
UJriTACCCAA 
AOGAAGXGGG 
GCTTOGAGGG 
GOfGCCACAGG 
OCSGAOiCSQCI 
GACTCAGCTC 
ATGAGQACAA 
AlCaTOGOCTA 
GCOCCATCAC 
TCtGTGCCAT 

cascctciGA 

CTAATGOOOG 
OCICAATCTG 
rAGAQOATOS 
TACACXGCCT 
ASGSGIGGGG 
TGGACAATCT 
AGCCAACCTG 
AI3GT»CTCCT 
TKTGAZITGT 
CATTTTTTT6 
TGOMUSAOAC 
TTAGTTCTTT 



11 
I 

CAAT6GGCTG 



CTCGGACCCA 
CAOGATGCGC 
GCAGCX^AACG 
CTTCCTCTOG 
TGACCCftGAC 
TCTQCACTQC 
CCACATCGAC 
GCAGTTCABC 
OCA6GCACAG 

acnaorcciAO 

CAAOCCCCTO 
GAGTGGCAAG 
CAGCCACAOG 
GGTCAAGTAC 



GGGCVGGAAG 
AGGACCCTQO 
GTCGCAAATC 
CaQQATGTTr 



AGTOCAGGGG 
OGCAGCCOCC 
CCCCAGAGGC 
TTATACACCC 



AaSAAAAA2A 
AGCTCTBAGC 
CCCCACTGGT 



21 

i 

GGOGGAGCGrr 
OGGGGGCGGC 
OGC96GO60OG 
TTTGACTTCA 
1-AC^GCGGGG 
GAGCTCACCG 
AAGGftCAGCG 
QAGVTGGTTG 
AACTOChTGA 
A7CAACAAGG 
AAOCCAGACA 
AAGAXCCACG 
GCCTCCCACS 
CAGOGGIACT 
GGCCGCATCa 
AGAGnSAGAC 
ACCTTCAGQS 

am3atccagc 
ocatoqccag 

TGGCTGTGTC 
TGCAGCSCAGC 
TTOCCCTTTC 
MUCMtCXlCHG 
ACTAGGTGQA 
ACXaSACCTG 
GGCACCTTCC 
GGAAAXQTGT 
AAAAAAGAAC 
UUTTATTCTT 
XGG6TTCAAC 



31 
1 

CACTTCCXXK3 
G6GGOGSGGC 
OGGOOOGCCr 
GGAGGTTTGA 
CCATTATCTC 
GAXX7ATAAC 
GTGGCAACiAT 
GGCTTGACAT 
AQATCCCGCr 

tccooggcAa 
ti3aogcatgx 

GAGCTTTCAA 
ACrAOlTGCr 
CCTAOCAGTA 

06CA00CGCT 



TGGGCAA<»T 
C3CTT6OCT0C 
CC3UVAGGGTG 
TCTTTTOCCC 
CClGGGGhGC 
AATGOUTATC 
CACaITAATG 
ggctttcsigc 
AOGAAAGAIG 
GGGAXCCACSL 
TCTGAOQACA 
ATAATCAAAA 
TABATTAAAA 



41 
I 

CAGCGGGAGG 

OQCCIGCAGC 
GATCTACAG6 
CATCTGCTGC 
GACAGAAGTT 
OOnCGTCAGT 
TCAGGA7GAO 
GAAChA^TGGG 
CTTCCAOGTG 
CATCCACAAG 
T9CTCTGGGQ 
GAAGATTGTG 
CAC3GGTOGCC 
CTGGTTOOGC 
QXACAGATTC 
CCTGGACT^ 
GCATTGACGC 
AGTGGCCTGT 
TGIGGGAAGT 
OTOTTTCTTT 



OGAGTGGOGA 
GAQGiaGGGGT 
GCTCOCRCCC 
AAGGTGOCCA 
IGOCrCTTCA 
GXGAAGGfiGC 
CT6AAC&TCA 
ATGGGC3U3GC 
GCAGQCrOCC 
TCCACACACA 
CTCrCCTTTG 



QATCAGCTCT 
ATTTGGTTTC 
TTTGGQACCA 
GTACTTGCCA 
TGAOTTQAAA 



TCTGAAGOGA 
GGCTGATTTr 



CCCACGGTTT 
AAC3IAGGAAT 
TAOGAOCrCA 
ATCACCAOGA 
TGCATCn^ 
CACAOCCAGC 
CTCCTTTGGG 
GGGGGGAAAG 
TTAGACAAAI 
GASrCTWGGCA 
CASOCAGGCT 
TOCCCTGOGC 
GGCT6GOCAA 
AGCAAGCCQC 
7X'Ci'TTXfl!£iv 
CnSTTTGCIA. 
AAAAAAIGTO! 
CAG 



1980
2040
2100
2160
2220
2280
2340
2400
2460
2520
2580
2640
2700
2760
2820
2880
2940
3000
3060
3120
3180
3240
3300
3360
3420
3480
3540
3600
3660
3720
3780
3840
3900
3960
4010



Seq ID NO: C214 DNA Sequence
Nucleic Acid Accession #: NM_002151.1
Coding sequence: 246..1499

1 11 21 31 41 51 

I I I I 1 I 

TC3SAGCCC6C rCTCCSUSGGA CCXZXACCTOA CaSGGGC3UZAG GTGAGGCAGC CTGGOCTAGC 
AGSDCC3CaOG CCADOTDCTC TGCCTCCAGG COGOOOGCTG CTGCQOOGCC ADCATGCTCC 
TGCCCAGGCC rCGAGRCTGA CCCQACCCCO GCACTACCTC GAGGCTCOGC CCXXaCCTQC 
TGOACCCCAa GGTCQCACCC TGGCCCAGGA GGTCAGOCAG GGRATCAZtA ACAAGAG6CA 
GTGACATGGC GCAGAAOOAG GGTGGCOSGA CXGTGCCAT6 CTGGTCCIiaA CCCAAGGTGG 




60 

120 

180 

240 

300 

360 

420 

480 

540 

eoo 

660 

720 

780 

840 

900 

960 

1020 

lOBO 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

162D 

1680 

1740 

1793 



60 

120 

ISO 

240 

300 
CRGCrCTCAC TGCGGSGRCC CTGCrACTTC TCACAGCCAT CGG06CX3GCA. TCCTGGGOCA 360

TTGTGGCTGT TCTCCTCAGG AGTQACCAGG AGCCGCIOTA CCCAGTGCAO OTCRGCTCTQ 420 

OGGACX3CTCG GCTCATGCjTC TTTGACAAGA OGGAAGGGAC 6TGG0GGCK3 CTGTGCTOCT 480 

. CGGGCICX:»A OGCCAOGOTA eCCGGAGTCA GCTGOSUaQA GATOGGCITC CTCAGGGCAC 540 

D TOACCCACTC CGAOCTGGAC 6TG0GAAC06 0G6GCGCCAA TCGCAOSTGO GGCTTCTTCr 600 

GTGTtMACBA GGGGAGGCTCJ CCCCACACCC AGAGGCTGCT GQAXSSTCATC TCCGTGTGTG 660 

ATTGCCCCAG AGGCXOTTTC TTGaroacCA TCTGCCAAGA CTGTGGCCGC AGC3AAGCTGC 720 

CCGTGGAC03 CATCGTC3QGA GGCCGGGACA CCAGCTTGQa COGGTGGCXZJG TGQCAAGTCA 780 

^ ^ GOCITCGCIA TQATOGftGCR CAfXTCTQTO GGGGATOCCT GC3:CrCGGGQ GACIQGGTGC 840 

lU ■FGACAGCC9GC CCACIQCTTC CCGGAGCGGA AGCGGGTCCT <3TCCOC3ATGG CGAGTGTTTG 900 

COGGTGCCQT GGCCXaK3GC!C TCTCOCX^OG GTCTGCAGCT QQGGQTGCAG GCTGTGGTCT 9^0 

ACCACGGGGG CTATCTTCCX: TTTCGGGACC CCAACAQCQA GGAGAACAGC AACQATATTG 1020 

CCCIGGTCCA CCTCTCCftGT CCCCTGCOCC TCACRGAATA CATTCXaGCXlT GTGTGCCTOC 1080 

^ « C&GCTGCCGG CCAGGCCXTTG GTGGATGGCA A8ATCTGTAC CGTGAOSGGC TGGGOC3UVCA 1140 

I J 0GCM3TACTA TGGOCAACM3 GCOOgGGTAC TCCAOGAGGC TGGAGTCCCX; ATAATC3W3CA 1200 

ATQATGTCT6 CAATOQCGCT GACTTCTATG OAAACCAQAT CAAGCOCAAG ATGITCTOTG 12^0 

CTGGCTACCX: CGAGGGTGGC ATTQATGCCT GCCAGGGCGA CAGCJQGTGGT CCCTXTGTGT 1320 

GTBA GGACAG CATCTCTOGG ACGOCAOGTT GQaSQCTQTG TOGCATTOTG AGTTGGQOCA 1380 

CTGGCTQTGC GC^rCGGOCAG A2^iGCCAlGG)Ca TCTAChCCAA ABICAGIGAC TTCCGGGS^ 1440 

ZU GOATCTTCGA GGCCATAAAG ACTCACTCC30 AMKXAGCGG CATGGTGACC CAGCFCTGAC 1500 

CGGTGGCTTC TCGCTGOGCA QCCTCCAGGG CCOGAGGTGA TCCC3SGTGGT GGGATOC3V0G 1560 

CTOGGCCGAG GATGOOACGT TTTTCTTCTT QQOCXXGGTC CACAQGTCCA AGGACACCCT 1620 

cccrccRaoG tcctctcttc cacaqtigqcg ggcocactca GarooGAcsAc csioccaacct isao 

CACCCTGCT6 ACCCCCATGT AAKTATTGTT CK3C3GTCTB QGACTCCTGT CTAGGTGCCC 1740 

Z J CTGATGATQQ QATGCTCTTT AAATAATAAA GATGGTTTT6 ATT 1783 



30 



Seq ID NO: C215 DNA Sequence
Nucleic Acid Accession #: AB037745.1
Coding sequence: 26..1744



1 11 21 31 41 51 

1 I t I ! 1 

ATGGTGGAAC ACGCTGCCCA CAAACATGGA AACOACCGTT CTCAGTGGGA TCRACTTCGA 60 

GTACAAGGGC AT6ACAGQCT GGQftBG'PGGC TGGTGATCZU:: ArETTACftCSUS CTGCTGOAfiC 120 

CTCAGACAA*r QACTTCaUTGA TTCTCACXCT GGTTGTGCCA OGATITAGAC CTGOQCAGTC 180 

GG^ATGGCA GACACAGftQA ATAAAGAGGT GGOOVSAATC ACATTTGTCT TTGAGAOCCT 240 

CTGTTCTGTG AACTGTQAGC TCTACTTCAT OaTGQaTGTG AATTCTAGGA CX31ACACTCC 300 

TGTGCawaACJG TGGATIAOGTT CCftAJW3GCAA ACAGTCCEAT AGCCACavrCA TTGftGGAGAA 3fiO 

CACTACGAGG AGCTTCAOCT GGGOCITCCA GnSGACCACT TTTCftTGAGG CA&iaCAaGAA 420 

W GTACiAGCAAT G9U3STTGCCA AGATCTACTC CATCAATGTC ACCAATGTTA TGAAT6GGGT 480 

GCCCTCCTAC TGCCGTCCXTT GTGCOCTAGA AGCCTCTOAT GTGGGCTCCT CCTGCACCTC 540 

TT6TCXOTCT OGTTACTATA TTQACCGAGA TTCTiGGAACC TGCCftCTCCr GOOCCCCTAA 60O 

CACAATTCIG AAAGCCCAOC A£3CCrTA<3X30 TQTCCAC3G0C TGT6IGCCCT GTGGTCC3U36 €C0 

. GACXSU0AAC AACAAGMTGC ACTCTCTGTG CTACAAXGAT TtKACdTCT CACGCAACAC 720 

TOCSiaCCAGO ACTTTCRACr ACAACTTCTC OGCTTTGGCA AACACCQTCA CTCrXGCIGG 780 

ACSOGCCAAGC TTCACTTCCA AAGGGTTGAA ATACTTCCAT CACTTTACCC TCAGXCfTCTG 840 

TGGAAACCAQ GGTAGGAAAA TGTCTQTGTG CACGGACTU^T GTC^CTGACC TCaSGAITCC 900 

l^AflGQTGAG TCAGGSITCT CCAAATCTAT CAC2U30CIAC GTCTQCX»GG CAGTCATCaVT 960 

CCOGCCAGAG G TOAC ABGCT ACAACSGCCOG QGTTTCCPCA CAGCCXGXCA dCCinSCXGA 1020 

jU TOSACTTATT GOGGTGUyCRA CAGATATGRC TCTGGA!rG8A ATCACCTOCC CAGCIGAACT 1080 

TTTOCftCCTG GAOTOCXTGG GAATACCGGA OaTSATCXTC TTTTATAGGT CCAATOATGT 1140 

GACCCRQTCC TGCAGTTCTQ QGAQATCAAC CACCMOOGC GTCAOGTOCA GTCCACAGAA 1200 

AACTGTCCCT GGAAGiTTTGC TGCTGGCAGG AACGTGCTCA GATOGGAOCT &EGAZdaCTG 1260 

„ CAAGTXCCAC TTCCTGTGaS AOftGOGCGGC TGCITCCXXS CTClGClCAQ TOGOMACIA 1320 

CX3VTGCTATC GTCAGCAGCT OTGTGQCTGG GATOdUSAAO ACTACITAOG TGllGGGGAGA 1380 

ACCCSUiGCTA TGCTCTQGT6 GGATTTCTCT GCCTGAGCAG AGTUSTCSvaCA TdGCAAAAC 1440 

CATAGATTTC TrGGCTOAAAG TGGGCATCTC TGCAQQCACC TGTACTGOCA XCCTQCTCAC 1500 

06TCTTGACC TGCTACITXT GGAAAAAiGAA TCAAAAACTA GAGTAC3UiaT ACTOCAASCT I5fi0 

BGTGATGAAT GCTACTCTCA AOGACTGTGA GCIGCCnSeSk OCTGIACASCr QCOCCSaCKE 1C2Q 

OU GGAAGGGGAO QATGTAGAGe AOGACCICAT CITTAOCAGC AAGAZUSl^CAC TCETTGGGAA 1680 

GASfCAAATCA TTTAOCTCCA AGCftOOCT^GC TCCTGTCStfK! ATCICTCXTT CAGAG6ACXC 1740 

CTQATGGATT TGACTCAGTG COGCTGAAGA CATCdXTU^O AGGGCCAGAC ATGGAOCT6T IBOO 

GAGAGOCACT GCCTGCCTCA CCTQCX3CCT CACCTTGCAT AGCIICCTTTG CAAGCCTGCM IfifiO 

GGGATTTGGG TGCCAQCATC CTSC&hCACC CACXGCTOSA AATCTCTTGA TTGTGOCCTT 1920 

OD ATCAGAT8IT TGAATTTCAG ATCTTTTTTT ATAjQAGTACC CAAACCCTCC TTTClg C TTG 1960 

C3CTCAAACCr 6CCAAATATA CCCACACTTT GTTTGTAAAT TATGCCCTTG CTTGTATCTT 2040 

QTTTCWCAAA ATGGOOCATC CGCCftOAGCC ATAGCTTCGT CTGCTCATAA TTCTTATAGC 2100 

TTTGGAATGA AAATATTTCT ATCTTCTTAA GTATAGAAAC TATTTCCTCT GXCCSCTAAC 2160 

TOTTAAGea(M AAACAGCTGG eAGTTTTOCT CGCMCGOOCT GAGCTCaTOA TCTCITCAG6 2220 

AGAGAGGCIia GGTGAGGAG6 GTGTCGGGOT TCC C IGgPeG ATAATCITCA TAGCAOCCTG 2280 

CSA^rCCATTTC CCCTOGATAA CCAGCTCAAA GGGAGTGAAA ATOGTAGTCT GAGOGCAAGG 3340 

QGAGCMiGGC CTGGGTAAOA AAAGCXJTTSA AAAGCATAAA AAGAGGCOGG GOGOGOTOGC 2400 

TCAGGCCIGX AATCXXAGCA CTTTGGGAGG GCGAGGC36GO CAQATCATQA G6IG0GGAGA 2460 

TUGAGAiQCSkT OCT O QCTAAC AOGGTGAAOC CCOGICTCTA GCOSRMOhC AAAAAATTAG 2520 

/D CCGGGCGIGG TGGOGGGTGC CTGTQGTCCC AGCTACTCXSG QAGGCTGAGG OGGGAGAATA 2580 

GC3GTGQGC3Cr GGAA6G00QA GCTTGCAara AGCCGAGATC GOGGCACtGC ACTCCATCCA 2 640 

aCCTGGGTGA CAGA6TGAGA CICTSCX^TCA AAAAAAAAAA AAAAAAAGAA AAGCACAAAO 2700 

AGAGGCAACA AGSAATGTTT TTGTnTTGA GACRGGCTCT CACICTGTCA CCTAOGCTGG 2760 

AGTGCnGIGG GGTAATCACT GTTCatGTGCA GCCTCAAGCT CI!EGGGCTCA OGC1A1GCIC 282 O 

OU CCATCTCflfiC CTCTCAAGXA GCTGGC^CTA CGA01!araCA CCACCAGGCT GACEAATTTT 2880 

TOTGITTTTT GTAGACACGG GGTTITCACXXS TGTXGCCCAG GCTGGTCXCC AACTCCTGGG 2940 

CICAAGTGAT CTGTOaSCCr CX3aCX:TOCCA AACTGCTOaa ATTACAGGCA TAAGCCACTG 3000 

CRCTCAGCCT TTTATTTGTT TTTTAAAOCA aSTAGCTCAT TGCCTTCTCT TAAQTAAAM 3060 

ATAGATAITC TC3U:TGAAGC CAAAGGAATA AaTTCnTCAA GAAAAIGCCC AAAGCCCXGQ 3120 










TGGATACATC 
QAACCAQQTA 
6ACATAGCAA 
GTTT06QCCA. 
TTTrAftTGGG 
TGTCTJSCTTC 
AAAAGGAACT 
GOAAGGCTAG 
AATCCC3UU3a 
TOAGACCCTS 
CTCACTATTA 
AACTAAAG?^ 
GGTQTCTTCT 
TACXTTAGAAT 
CAGTTCTGOC 
CCATGATTTG 
GAAAQQOfShT 
TGCTGCCCTG 
OGCCATCAAC 
TTTGG6TATC 
TQ^TGGAACT 
TTCTGTAGCC 
CAAAGGCCAC 
TTGACTTCTT 
ACXACCCCIT 
GTTACGCTAC 
ACAGCTTTAT 
AT GGTTTT CA 

CCA^ATCTT 
CCTCGCCAGT 
TAaTJ«3AGAT 
AT?CCACiCCAC 
CTAOGTGnC 
TGTCCAAjQCT 
TTBAGTCXnXS 
ATACAAATGA 
QTGCCAACAA 
AACAGftAACT 
CTgqTTG OA T 

TATTGCArror 



CTCCCTATCT 
AGCCCCAAAC 

AcccrarcAa 

TATTATGGAA 
GCACTTTAGG 
TAAGCXrAAC 
<3Aft.TCX3CAGG 
GAGAGTTTAA 
CAQAATOGTG 
TCTCTAAAAA 
AGOCCCTATC 
ACTTCTGQTC 
ATCOCCATTT 
GG6TAAAACC 
TCCATCACTA 
TGCCTQTGCC 
GGCATTTGCA 
GCAATGGCTT 
AOTGTTGGTG 
CAAATTCTCC 
TCCC0CAJ3aX3 
TCTIGCAATAC 
AGACAGCCX:! 
TATTClGAiCl 
ACOGTGCTGA 

TOAGATATAA 
GCAAACTCAC 

GGCTCACTGC 
AGCTOOQATT 
GGGGTTTCAC 

CToracxrrcc 
AATTTTCTAT 



GAACACGCAG 
AQATTQTOGT 
AACTTTTTTT 
TTBTAAXAOA 
TTOCTTTTTA 
Q^CTTTTTTT 



TTTTTTTAAA 
CCAGATGTTC 
TQAQGAAAAT 
CM3GGC3ICTC 

ggaiacagcc 
gcatctgctc 
ccx:ataogcc 

CAAGATTTTC 
GCTTSaGCTC 
ATTTAAAAAT 
TTTCTCTTTT 
OVATTTCTGQ 
TCCCCAATTT 
AOAQCAAGAC 
TCA0GCTAG6 
CTTTCTCCAO 
AGTGACTAGT 
TDTAAGAQTC 
ACGGCAGGGA 
CCTCCTTOTQ 
TGGATGCAGA 
TTGCGCACCT 
TAGACTATTC 
OCACTOftXTT 
CTTCTGCAG6 
TGTTGCTCCC 
TTCACATATT 
AGAGriGTCC 
QAQACft0AQT 
AGOCICCOCA 
ACAQGCATGC 
0GTGTTG6CC 
CAAAGT6CTG 
GAACAAAQGC 
GAGGGAGTAT 
TAACTGTCar 
OTATTCAAGC 
TGTCTGACTA 
TTTTTTCAGC 
CCCT6XAATC 
TTTTTGTAAA 



cx:ttccacta 
cagccttatc 
tccccatcct 

TTATTT6AAA 
CACAAIGGGA 
AGGCACA6AA 
AQCACCAGAA 
ACTGGQCCCA 
AOGAGTTCAA 
AAACAAQGTG 
TTTCATTCTC 
CAACATCCCT 
TACaCAAACT 
TTTAAATTAC 
6TGACCITCC 
"TOACCATTTG 
CTGCCACAAA 
AATGAGAACT 
GTOCdTTGG 
TAGGAGTCAG 
TACGCRSCTC 
QCTGTCTCAC 
CGGAAACAGT 



TCTGCCCTGT 
CTAGC3UU3TT 
ATACAATTCA 
GCGCACTTGA 
CM3CTTTGTC 



accAOCAcoc 

AGGCTGGTCT 
GGATTGCAGG 
TTTAGTCCTT 
QATAAAATGT 
GCIATAGTCSA 
AGTAOGGTTT 
CATTAAAGAO: 
Tm^GTGAAAT 
CAAGCCnrAA 
ATAAAAACAT 



TCACTCTATG 
CTClATTGGG 
TOAGTGCCOC 
AGAGCACAAG 
TGGGCCTGAG 
TAftACGTCTA 
TCAAACCAGT 
GCATGGTGGC 
GACCAGCCXG 
TTCAOCAAGC 

aattgctttg 
tctgaaag6t 
attatcaatg 

CITGGTCAAG 
GTOACCAGAT 
ATGCTCRTCT 
AGAGCCAGGC 
TTTAATAAAT 
GCTCTCAGAA 
CTGAQCTCCA 
TGAATAGCTT 
AGGAAAAATT 
AC7TTAAGGA 
GACCTGXCAG 
AGGCATGTCA 
CCTTTAAAAC 
QAGCAAACAC 
GCCCAGGCTG 
AAGTGATCCT 
CTAGCTAATT 
CAAACTOCTG 
TGTGAGCCAC 



TTAAATCTCA 
TCATCTGTAT 
TTGCTTTTGT 
AAGACTQACT 
OGAATTTTTT 
TAQfTTTOITA 



ACACTGAAAA 
TTTACCCACA 
CGTCCTAGAA 
GAGGCCAAGA 
GTOGCCGTGA 
GGCTGGCCAA 
CTTCAAGGAA 
TCACACCTGT 
GGCAACACAG 
TGGGATACTT 
TGTQATAAAA 
GA6TAGAGT6 
AACTTTTAAG 
CTTCTACTGG 
CCCCAATTOC 
GGTAGATATA 
OATTAQCCAC 
TGTGGTCOCT 
0(ZAGTTTTTC 
OCXG'I'G'i'CCA 
QCCTAAAGTC 



ACATATGTCT 
GCTACITTTT 
QAACTCCTOA 
TATATTTTTA 
ATACGATTOl 
ATGTTCAATT 
GAGTGCAGTG 
TCTGCTTCAG 
TTTGTGTTTT 
QACTCAAGTa 

OGTOccreec 

TAAA6TGGTC 
TTTGGTTACC 
TTGGCTGGGA 
TTTTGTTTTA 
ATATTTATAC 
TTCATCAGGG 
GAAGATGGGT 



Seq ID NO: C216 DNA Sequence
Nucleic Acid Accession #: llM_004d£4.1
Coding sequence: 26^-952



1 
I 

OGGAACGAGG 
TC3U3ATGCTC 

OGOCGAGGCG 
ATTOGGAC3AG 

AGTOOaGCTG 
GGGGCTOCCC 
AAGGTCGTGG 
GCCOGCGCTG 
ftlCTICGTOC: 
CCGGAGAGGS 
TCTGCACACXS 
AC3BGGAGGTG 
CAT6CA03GG 
ClGCTGCaTrG 
GTOGCTOCAiG 
GGTCCITCCA 
GGGCTCAAGG 
TTATTTATTA 
ACTGTGTATT 
AAAA 



11 

I 

GCAACCTGCA 
CTG6TGTTGC 
AQCCGC0CAA 
TTGOGGAAAC 
TCQAACACCQ 
GGATOOGG06 



GACQTGACAC 
CACCIGCOAC 
QCAOQGOOOC 
OGTGOGOGCA 



CAAGIGACCA 
CAGATC^AAGA 
COCGCCAGCT 

CTGTGCACCT 
TTCCTGAGAC 
TTAATTTAIT 
TATTTAAAAC 



21 
I 

CAGOCATGCC 
TGGTGCTCIC 
GTTTCOOGGG 
GCTAOGAG6A 
ACCTCGTCCC 
GCCACCTGCA 
GGCTTCAOC3G 
GAOCGCTGOG 
TOTGGCCOCC 
AGCTGGAGTT 



OBCTGGAAGA 
TGIIGCATOGG 
CQA0CCTGCA 
ACAATCCCAT 
AtTTTGTTAQC 
GCGCGGGGGA 
ACCJCQATCTCX: 
GGGGTGACCX" 
TCTOGTGATA 



31 
I 

CGSGCAAGAA 
CRDGGCIiGCCG 
ACCCTCAGAG 
CCrOCTAACC 
GGCGCCT6CA 
OCIGGGECATC 
QQCTCTGTTC 
GGGTCAGGTC 
GCOGTOGCAG 
CCSAC1TC03G 
CIGTC9C3GCrC 
CCTGGGCTG6 
CGCGTGCCCG 
OOGOCTGAAS 
GGTOCXOVTT 
CAAAOACTGC 



41 
I 

CTCAGGACGG 
CATQGG3G0G 
TTGGACTCOG 
AGGCTGC3GGG 
GTOCGGATAC 

CGS CTST OOC 
AGCCTTGCAA 
TOGGAiOCAAC 
CX33C3UU3CCG 



TGCC(MACA 
TCTT6GG6AC 
AAAATAAAGC 



GOOGAnrGGG 
AGCCAGTTCC 
CCCGACACGG 
C3UWAf3ACG9 
CACTQCATAT 
GttGtCCTQC 
GCTGTATTTA 
TOSGGQGCTG 
TGTCTGAACT 



51 
1 

TOAATGGCIC 
OOCTGTCTCT 
AAGACTCCAG 
OCAACCAGAG 
TCAOGCGAGA 
CGCTTCCOGA 
OGAOOGGGITC 
GACCCCAAQC 
TGCTGGCAGA 
OCAOGGGGCG 
GITGCTGCOG 
TOCTGITOGOC 
6GG0GGCAAA 
AGCGAGGGCC 
ACACOGGGGT 
GAGCS^XCiCr 
CCTQTGQAAT 
TATAAGTCIG 
GTCTBATGGA 
GTIAAAAAAA 



Seq ID NO: C4.32 DNA Sequence
Nucleic Acid Accession #: |gH_052a58.1
Coding sequence: 54..1259



1 
1 

GGCA06AGGT 
ATCCGTOGGG 
CCCACCCAGA 
GGAAGOGAAG 
AGGAGA6GGA 
GAGAOOCGGA 
AACAOGGAGI 
CCTGOOACSaC 



U 
I 

GrTGCX:CTCA 
GGCTOGCOAG 
CCAAGGOCXSC 
CAGOQAOGG6 
□GGGAAOCGC 
CGGAGGCOCC 
TTGGGAAAAA 
AGCa3C3GGCT 



21 
I 

GGTCQCTCCC 
CCCCGGOCCC 
ACCCAOGATC 
AACCGGCGAA 
GAOCGGAACC 
CXKXXIGSACA 
CGQCBCCAAA 



31 
I 

GGG03CQGAC 
□GCCeASAGA 
GACGGCGOGA 
GGGAOGGGGA 
GGGACOGGGA 
CACACAGGGA 
GCC0QACG06 
CSOCCTGGOA 



4X 

] 

AOGQAACCCG 
GCGGGAOCOS 
CQGA£XX3Qaa 
COGGGACOOG 
GAGG6AGAGA 

caaoQGGCCcr 

GGAOG6AGCC 
MBCCCCQGRB 



51 
I 

6CCATGGAAG 
GGAOGGCQCC 
GACCCXX3GCA 
AAGAGnGACC 
GAQAOGGAAA 
CGCGCftGGTO 
CGGGGACTGA 
COGOOGCA6C 



^180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
39^0 
4020 
40 BO 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4S60 
4620 

46ao 

4740 
4800 
4B60 
4920 
4360 
5040 
51D0 
5160 
5220 
5200 
5340 
5400 
5460 
5520 
5567 



60 
120 

leo 

240 
3DD 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 

loeo 

1140 
1300 
1204 



60 

120 

180 

240 

300 

360 

420 

4B0 
CGCAfiAGGAA C3QQW3ACCCC GGGCXSCCGCA GACCCGAAAG IGAACCCCCT TCGGAOAGAT 540

ATCTGCCCTC GACCCCCAGQ CCTOGACGAG AfSGAaOTGQA ATATTACCAQ TCJyGftGGCXSG 600 

AACG&CTCCT GGAATGCCAC AAATGCAAAT ACTTQTGCAC TGGGA.6AGCC TGCTGCCAAA 660 

TGCFGOAGGT TCTCCXGAAC TTGCTGATCC TGGCCTGCAG CTCTGTOTCT TACAGTICCA. 720 

3 CAGGGGGCTA CACGGGCATC ACCAQC^TTOS GGGGCATITA OTACXKTCPiS TTCGGAGGGG 7B0 

CTTACAGTGG CTTIGATGGT GCTGAOGGGG AJ3AAGGCCC3V GCAACTGGAT QTOCJWSTTCT 840 

ACCAQCTAAA <3CT(3CXX»TG GTCACTGTQS CAATQGCCTQ TAfiTGGAGCC CTCACAGCCC 900 

TCTGCTGCCT CTTCCTTGCC ATGGGrr6TCC TGCGG9TCCC 6TGGCATTGT CCACTGTTGC 960 

TGGTGAOCGA AjGGCTTGTTG GACATGCTCA TCBOSGGGGG GTACaTGCCG GCCTTGTACT 1020 

10 TCTACTTCCA CTACXTTCrTCT GCTGCX2TATO GCTCTCCTGT GTGTAAAGAiG AGGCRGGCGC 1080 

TGTAOCAAA6 CAAA6GCTAC AGCGGTTTOG OCTGCAISTXT GCAOGGAGCA GATATAGOAQ 1140 

CTGGAATCTT TOCTGCCCTG GGCATTGTOG TCTTTGOCCT GOCSGQCGGTC CTGGCCATaA 1200 

AOGGCTACOG AAAAj^TTAGG AAGCTAAAAG AGAAC3CCA6C AGAAATGTTT GAATTTTAAO 1260 

QGTTTCTAflA AOGCTCTGAC AGATGGAAGT GGIGGTG6AA GGTAGTCTGA GCCACTGGCt 1320 

15 TTCCC ft AGftA TCCCTTGTrG TGGAAigTTTC CAATGCItSGA AAAGCAGCGA GCCAGOSTTG 1380 

GTGTGGTGGG CGOAGCTCCC AGTCGC3W?GG AGCGGTGTTC ATOQATGCAR CRGRCCCrrOG 1440 

CTTCTOGAGT CCTCTGTGAG TOAGGOACCA ATCAAAATTA TTTTTCAAAA AGCAAAAAAA 15 DO 

OXSGOCGGCCT CQGOGGCTCA CAOCTGrAAC CCC3«3CACTT TGGQAiSGeTG AGGTGGGTGG 1560 

ATCACTTGACS OTC3M3GAGCT CGAGAOCAQC TTOOCCRAGA TGCSTGAGOCC COGTCTCTAC 1620 

2X} TAAAATACAA AAAAATIAGC Ca^jGQQSTGGT GGOGGGGGCC TGZAATCCCA GCTACTTGGG 16B0 

AOGCIGI^C AG6ASAAT0S CTTGAA.TCTG GOACSGCaSCSAG AXUSCAGTOA GCGGAGATCC 1740 

CGCCACTGCA CTCCAOCXXA. GGTGACAGAG CGAGACTCCA TCTCAAAAAA AAAAAAAAAA 1800 

Seq ID NO: C:434 DNA Sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 261..2861

1 11 21 31 41 51

D\) GAGCTAGCGC TCAAGCAGAG CCCAGOGCCO TGCTATCGGA GA6AGCCT6G CX?AGCX3CAAG 60 

CQQCaCOaGO A£3CCAQCGGG GCTGEAGCGCG GCXIAGGGTCT QAACCCAGAT TTOCCRGACT 120 

AGCTAOCACr COGCTTGCCC AOGCCCCGGQ AGCTOGOGGC GCCTGGOGGT CAGCGRCCAG IBO 

AOGTCOGGGG CCXSCTGCXSCT OCTGGCCCGC G2U3GGGTGAC ACTGTCTGGG CTACAGACCC 240 

AGA6GGAGCA CACTGCCAGG ATGGGAGCrG CTGGOAQGCA GGACTTCCTC TTCAAOGCCA 300 

35 TGCTQACCAT CAGCTOaCTC ACTCTGACCT GCTTOCCTGG GQCCACATCC ACAGlGGCTG 360 

CTOGGTGGCC TGACCAGAGC CCTGAGTTGC AACCCTGGAA COCTGGCCAT GAOCAAGACC 420 

ACCATOrrOCA TATCGQCXaUS GGCftAGACAC TQCTGCTCAC CTCTTCTQCC ACGOlCTATT 480 

CCATCCACAT CTCAGAGGGA GGCAAGCTGC? TC3UXTAAAGA OCS&GGftOGAG COGATTGTTT 540 

TGCGAACOOa GCACATCZCTO ATTGACAACG GAGG3USU5CT GGKIGCTGaa AGXGCXK^TCX 600 

40 GCCCTTTCCA GGGCAATTTC ACCATCATTT TQTATGGAnG GGCTGATGAA G6TATTCAGC 660 

CQOATCCTTA CTATGaTCTQ AAGTACATTG GGGTTG6TAA AGGAGGCGCT CTTGAQTTQC 720 

JVIGGACAGRA AAAGCTCTCC TQGACA1TTC TGAACftAtSAC CCITCACCCA GGTGGCATGG 780 

CaU3AA6(3AaG CTATTTTTTT GAAAfSGAGCr GGGGOCAOOG IGGAOTTATT QTTCATGTCA 840 

. TOGACCOCAA ATCASSCACA GTCATCCATT CTOaCXGGrT TGACACCXAT AGATCCAAGA 900 

45 AAGAGAGTGA ACGTCTGGTC CAGTATTTOA ACGCOaTGCC OGATGGCAGG ATCCTTTCTO 960 

TTGCAGTGAA rCGAl'GAAGGr TCTCGAAATC TGGATGACAT GGGCAOGAAG GCGATGACXilA 1020 

AATTGGGAAG CAAACACTXC CTGCACCTTO GATTTAGACA CCCTTGGlRGT TTTCXAACTG 1000 

TGAAASGAAA TGC&TCATCT TCSUSTQQAAG ACCATATTQA KIHSXATOGA CATGGAGGCT 1140 

CTGCK3CTGC OCGOCJTATTC AAATTOTTCC AjGACSyGAGCR. TGGOGAAXAT TTCARTOTTT 1200 

50 CTTTGTCCRG T6R6TGQ(3TT CAAGAOSTGG AGTGGACXSGA 6T0GT1*CGAT ChSCGATAAAG 1260 

trATCrCAOAJC: TAAAQOTOaQ GAGAAAATTT CAGACCICTG GAAAGCrcaC OC3U3QAAAAA 1320 

TATGCAATCX3 TCCCATTGAT ATACAGGCCA CTACAATGOA TGGaGITAAC CTCRGCRCCG 1380 

iUSGTlGlCTA CAAAAAAGGC C3\fiGATTAXA OSTTTGCTTG CTACGACCGQ GGC3U3AGOCT 1440 

- GCGGGatGCTA CCGTGTACGG TTCCTCTQIG OOAAjtSCICnaT G2USGC9CCAAA GTCACAGTC3V 1500 

55 CCATTGhCAC CAATGX6AAC AGCAOCATTC TGAACTTGGA GGATAATGTA CKSmOGOA 1560 

AACCTGGAGA TACOCTOQTC ATTtKX!A13TA CTGATTACl^C! CATGTAOCAC GCAGAAGAGT 1620 

TCCAGGTGCT TCCCTGCaGA TCCTGOGCXX: CCAACOWSGT CAAAOTGGCaw GQGAAACCaA 1680 

TOTACCTGCA CATOGGGGA0 OAQATAGAOQ GOGTGOAC^T GCGGGGGGAG GTTGQGCTTC 1740 

^ TGAIGCCX3GAA CAYCHTAGaXS ATGG9GG»GIA TGGAflGACAA AUQCTAlCXXX! TACAGAAACC IBOO 

OO ACATCTGC3UI T T TCTTTOAC TTOaRT3UaC!T TTGGSGSCCA CATCAAGTTT GCTCIGGOAT 1860 

TTAAGGCAGG ACACITG6AG OGCAOGGAGC T6AAGCATAT GGGACAGCAG CTGGTGGGTC 1920 

AjtSTAiC3CCaAT TCACTTCCAC CTGGCCGGTG ATGTftGACGA AAGGG3AI3GT TATOAOCCaw: 1980 

CCACATACftT TCCATCCATC ATACHTETCTC TGGCIGOGIC ACASTGCATG 2040 

^ GCTCCAAXGe CTTGTTQA!XG A2USGIU3BIT6 TGQGC3ATAA CTCrTTGOGC GAClQCrCCT 2100 

05 TCAOGGAAGA TGGGCCGSnO OAAQOCSUUSL CTTTTGACCA CTGTCTTGGC CTOCTTGTCA 2X60 

, AGTCTGQAAC CCTCCTCCCC TOGGAOQGTG ACAGCRAGWr OTGCAAQATG ATCAC5VGACG 2220 

ACTCCTACCC AGGGTACATC CCCAAGCCCa GOCAAflACTG CAATGCTGTG TOCACCTTCT 2280 

GQATGGCCAA TCCCAACAAC AACCTCATCSL ACIGTGCCGC TGCAOQATCT GAGGAAACIG 2340 

GATITTGGTT TATTTTTCAC CACIGXACXM CGOaOCQClC GGXGGGaWTQ TACT000C3US 3400 

70 OraMTCAOA OCACATrcCa. CwaaflAAAftr SCTATRACAA OOSRGCRCAT TCCAACTACC 2460 

GGGCTQQCAT GATCMAGAC AAOGGAGTCA AAACCAOOGA QdCCTCTGCC AASGRCRAGC 2520 

GGCOOTTCCT CTCAATCATC TCTGCXMAT ACAGCCCTCA CCAGGAOGCC GACCXaaCTOA 2580 

AGCCOGGGGA GOCOGCCATC ATCAGACACI TCATTQCXITA CAAfiAAlCCAG GACCAOGGG6 2640 

CCTGGCraOS CBGOGGQGAT QTQTOaClGG ACAGCIGCCA TTTCAGAGGO GAGGCTCAGS 2700 

75 AAGGCTTCTT GCTTAt^AGGA ATGAAOGCTG GGGGC31TTTT GCTGGGGG6A GATGASSCAG 2760 

CXrrCXiGGAAT GQCrCAGGGA TTCAGCCCTC CCTGCCXSCTG OCTGCTTGAAG CTGGTOACrA 2820 

OGGG6T0GCC CTITGCICAC GTCrCTCTGG OCCACrrCATG ATGGAGAAGT GTGGICAGAG 2880 

GGGaoaiATG GGGTTTGCTQ CTTAi:GAGCA CAGAGGAATT CAGTCCCCAG GCAGCXXTQC 2940 

0^ CTCTGACTGC AAGAGG6XGA ASTCCACnOA AOTOAGCZIGC TGCXaflAGGG CCrCATTTGC 3000 

oO TCTTCKTOCA GOGAACTOAG CACAGGSGGC CTCCAiGGAGA CCCTAOAZGr GCTOGTACTC 3060 

CCTCGOCCTG GQATTTCAGA GCTGGAAATA TAGAAAATAT CTAGOCCAAA GCCTTCATTT 3120 
TAACAGATOG GGAAAGTGAG COOOCAAGAT GGGAAAGAAC CACACAGCTA AGGGAGSGCC 31B0 

TGGGQAGCCX: CACOCTAGGC CTTGCTGCCA OlCC2U3VnX3 CCTCAACAAC: CGGOCXXAGA 3240 
GTGCOCTUGaC ACrCCIGAjSG TAaCTTCTOQ AAATBGGGAC AAQTCXXXTTC GAAGGA2U^ 3300 
AAATGACTAS AaTAOAATG?^ CRGCTAGCftG ATCTCTTCCC IXXTTGCTCCC AGGGCACACA 33€0

AA.tXXX3CCCX OOCCTTGGTQ TTGGCGGTCC CrrGTGGCCTT CACtTTGTXC ACTACCTQTC 3420 

JU3CCCAGCCT GQGTQCACAG TAGCTGCAAC 'XCOCCATTGG TGCTACXmaa CTCTOCTGTC 34 BO 

TCXGCAGCrC TACAGGTGAfi <3CCCAG<ZAGA 6GGAGTAGGG CTGGOCA3GT TTCTQGTGAG 3540 

5 OCSUMrrTGSC TCATCTTOGG TXSTCTC^CA GCTATTGGST OCACCOC3«3T CCCrrrCAGC 3600 

TGCTGCTTAA TGCCCTGCTC TCTCCCTGGG CCACCTTATCA GAGAGCOCAA AGAGCTCCTG 3€60 

TAAGAGGGAG AACTCTATCT GTGGTTTATA ATCTTGCACG AGGCACCASA QTCTCCCTGG 3720 

GnZTTGTGAT GAACTACATT TATCCCCTTT CCTCCCCCAA CCACAAACTC TTTCCTTCAA 3780 

_ AGA06GCCTG CCIGGGTOCX: TCChCCCMiG TGCACCCAT6 AGACTOSGTC CAAGAGXCCA 3940 

10 TTCCCraOGT OGGAGCCAAC TGTCAGGQAG GTCTTTCCCA CCAAACATCT TTCAQCTQCT 3900 

GGGAOGTGAC CATAGGGCTC TGCTTTTAAA GATATGQCTG CTTCAAAGGC CAGAGTCACA 3960 

GGAA5GACTT CTTCCAGGGA GATTAGTGOT GATGGAGASG AGAGTTAAAA TGACCTCATTG 4020 

TCCTTCTTGT CCACX3GTTTT GTTGAG'iTTT CACTCTTCTA ATGCAAGGGT CTCAGACTGT 4060 

GAAOCACTTA GGATGTQATC ACTTTCaaOT GGCCAGGAAT 6TTGAATGTC TTTGGCTCAO 4140 

15 TTCATTTAAA AAAGA^ATCX ATTTGAAAGT TCTCAGA0TT GTACATATGT TTCACS^AC 4200 

AGC3ATCTGTA CATAAAAGTT TCTTTCCTAA ACCATTCACC AA6AGCCAAT ArCTAGGCAT 4260 

TTTCTTGGTA GCACAAATTT TCTTATTGCT TAGRAAATTG TCCTCCTTQT TATTTCTGTT 432 Q 

TGTAAGACTT AAQTQAOTTA GGTCTTTAAG GAAAGCAAOG CTGCTCTGAA ATGCTTGTCT 4380 

Tm^CKm GCQGAA&XAG GTGGTGCTTT TT006GAQTT AOATGTATAa MSSGTTCGfCA 4440 

ZO TGZAAACATT TCTTGTAGGC ATCACCATGA ACmAAC3Amr ATTTXCTATT TRTTTAITAT 4500 

ATOTGCACTT C:AAGAAGTCA CTGITCAGAGA AATAAACAAT TOTCTTAAAT <3T(=ATaATTG 4S60 

GAGATGTCCT TTGCATTGCr TGGAAOGQOT GTAOCTAGAG CCAAGGAAAT TGGCTCTGGT 4620 

TTOGAAAAAT TTTGCltStrra. TTATAGTAAA CATACAAAGO ATGTCAAAAA AAAAAAAAAA 4680 

2^ MAAAAAAAA AAAAAAAAAA AA 4702 

Seq ID NO: C217 Protein Sequence
Protein Accession #: HP_00580S.i

1 11 21 31 41 51

30 I I I I I I 

MVGKMWFVLW TIXAVRVTVD AISVBTPQDV LHAfiQGKSVT LPCTYHTSTS SREGIiXQWDK SO 

IiLLTHTEBW IWSFStlKNYI HGELYKHRVS ISXINAEQ5DA OITIDQiyCHA DNGTYBCSVS 120 

LM&OLBGNTK SRVSLLVLVP PSKPECGIBG BTIXGNHIQL TOQSKBGSPT PQYSnKRYNI IBO 

ZiIiQEQFZ>aQP AdGQFV&LKN XSTDTBSyYI CTSSKBEGTQ FCSVLTVAWS P&MBiVALieVG 240 

35 lAVGSnmALI IIGXIIYCCC CRGKOEaarED XSDARSklSfiA YBSVSEQLBS LSRBIUSBBDD 30Q 

YSQBEQRSTG RESFDHU^Q 319 

Seq ID NO: C218 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

1 I I 1 I I 

HGSRTPESPL HAVQLRnGFR RBFPJjXiFLI^ LLI.PPFFKVG GFNLDAEAPA VI>aGPPGSF7 60 

G^FSVEFZBPS 7D6VSVX.VGb. PKKHPrSQPGV LQGGAVYLCP HGASPTQCIP ZSEBSHBSRI. 12D 

45 IsBGSLfiSSBS BBPVBYKSLQ NFGATVXABG SSZLACAFI>Y SHRTEKEEIiS DPyGTCVXiSr IfiO 

DHFTRll^YA PCRSDFSWAA GQGYCQGGFfi ABPTKT<^W LOGFGSYFHO GQUkSATOEQ 240 

XABSYYPSYL ItlLVQGCJLQT S0A88IYDP8 YIOT^AVOE F5QDDTEDFV AGVPKGKLTY 300 

OinrriLEIGSD ZRSCiiraFSaB QHASyPCSyAV AA10VN(^6L 39DL£VG;APIiZ> HDRTPDGRPQ 360 

EVGRVYVYLQ HPAGISPTPT LTLTGHDEFO KFQSSUXBC^ DLDQDGYHOV AZGAP9GGET 420 

5U QQBWFVFPa GPGGDSSKPS QfVIiQPLffAAS BTPDFF6BAL RGGRDU3G(!K3 YPDLZVQ9PQ 480 

WKAWYRGR FIV8A6ASLT IFFAMFNPEB RSCSLBOHPV ACnSLSFdJ!! AfiGKHVADSI 540 

OFTVEljQLDQf QKQKGGVSISA U?IiA6R0ATL TQTIfLIQHGA R&DCREMKXY I.RNESEFHDK 600 

IiSPIEIALHF SZ^POAPVDS HGLSPAUiyQ SKBRIBDKAQ ILLIX3GEDHI CVPDLQLEVF 660 

GBCilNBViaiGD XNAUnUFBBA. QNVCTGGKTB AELRVTAPFB ABySGLVRBP OHFBSLSCaay 720 

55 PAVN0SR1>i(V CDZJWHSAQ ASIMOQIAFI -VPHI.RDTKKT XQRDFQU^K HliKHSQSDW 780 

8FRI.8VEAQA QVTLHGVSECP BAVZ.FPVSDH BPRDQFQKEE DliGPAVHHVT SI.XHQGPSSI 640 

SQGVLEIiSCP QAI.EGQQ[iIiY VTRVTGUICT THEPIllPKGL ELDPEG&LHE QQXREAPSRS 900 

&A6^QvQlt£. CPEASCPALR CBLGFLHQQB 8Q8LQLHFRV NAKTFLQREH QPFfiLQCEAV 960 

^ YKAIiKMFYRl I^PSQIiPaKER QVATAVQBCTK AESSYSVPLH XIlIAILFGIi U^&tiJYXIi 1020 

OU YKtdeFFKRBL FYGTAMBECAQ LKFPATSDA 1049 

Seq ID NO: C219 Protein Sequence
Protein Accession #: VP_002412.1

1 11 21 31 41 51

• I 1 1 I I i 

KBSPPPLLLI. LFHGWSHSF PATLETQEQD VDItVQKYUEK YYBII.K1IDGRQ VfiKRitHSGFV 60 

VEKUKiOHQEF FOliS^tTGKPD AETI.KVMROP RCGVPHnTAQF VLTBGNFRHB QTHItTYRIEN 120 

YTPDLiP&AW tlBAIBKAFQL KSHVTP!LXFT ICVSBGOADIM ISFVRmHRD HSPESGPGGN IBO 

70 ImSOXSQBGSG ZGGOAEEDBD ERHTHNFilET KlfiRVAAEEI. GBStiGX>GHST DIGALMYPSY 240 

TPSGDVQUVQ DDZI3GXQAXY G&SQKPVQPI GPQTPKACD8 KLTPDAITTI RGEVMFFKDR 300 

FYMKTNPFYP EVEUIFISVF VWQLENGhEti. AYEFADBDEV RPFKGNKYnA VQGQNVIiHGY 360 

PKDXYSSFGF PRTVJCHIDAA laSESHTQKTY FFVANKYHRY DBYKRSMDPG YPStMIAHDFP 420 

GXGHKVDAVF HiCD6FFyFEH GURQYKFDPK TKRXUTLQKA HSHEGlCRKir 469 

Seq ID NO: C220 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

80 I I I 1 ) I 

HHSFPPUitili ££ffGWSHSP PATLBTQBQO VDLVOKYLEK YYNIiKHDGRQ VEKRRNSGPV 60 

VEKlilDQHDEP FGLKVTGKPD ABTI.1CVMKQP ROJVPDVAQF VLTBGNPRWE QTHLTYRIEN 120 

YTPDIiPRAlTV DHAIBKAPQL (9SNVTPLTFT KV8EGQADXH ISFVSGDHRD NSPFD6PG<ar 180 

1-ABAFQPBP6 IGGDAHFISIED WXTm^WSX NZaERVAAIlAL GHSLGIiSHST DIGAIMYPSY 240 
TPSCSDVQIiAO T>DIT)GIQAIY GRSQNPVQPI GPQTPKAC3)S KLTFDAITTI RGEVMPFKDR
FYMRTNPPYP SVEIJIFlfiVF WPQLPNGIjEA AYBFADRDEV HFFKGNKyWA VCJCCNVLBGY 
PKDIYfiSEGF PRTVKHIDAA LSEEKTGKnf FFVANKYWRY DEYKRSMDPG Yi»KMUHDPP 
GIGHKynAVP MKDGFPYFFH GTRQYKFDPK TKRXLTLQKA KTSWFHCRXN 



Seq ID NO: C221 Protein Sequence
Protein Accession #: HP_0j55i46.1



1 
1 

MVRKFW9TI 
GIPISPKGVL 
LPAFVRVWVE 
SMSVeWSARI 
YGHYAYAQWF 
NAVAVTPSEK 
RKHTPI*PAVI 
FKVPLFIPAL 
8BKXTRTLQZ 



11 
I 

SKGG YLQGMV 
QNTGSVGMSIj 
LDIIEPAATA 
QIFLTFCiCLT 
YliNFVTEEVB 
LLGHFSLAVP 
VLHPIiTHlMI. 
FSFTCLFMVA 
IICWPEEDK 



21 
I 

NGRLPSliOSK 
TIWTVCGV1»S 
VIBLAFGRYI 
AZIiiriVFGV 
HPSKTIPLAX 
IFVAIjfiCFGS 

IiSIiYSDFFST 
L 



31 
I 

EPPGOEKVQIr 
LPGALSYAEE, 
I*EPFFIQCEI 
KQLIKGQTQir 
CISMAITI6V 
MNGG7FAVSR 
FIiSFAHWLPI 
GIGPVITLTG 



41 
I 

KRKVTIiljRSV 
GTTIKKSGGH 
PBLAIKIiITA 



YVLTNVAYPT 

I>FYVA&REGH 
GLAVAGLXYI. 
VPAYYIiFlIH 



SI 
I 

GIIIGTIIGA 
YTYILEVFGP 
vaiTWMVI^ 
SITBLPEAFY 
TZHMEBLZ.LS 
IiPEILSHIHV 
RY3CCPOMHRP 
DKKPRWFRXH 



Seq ID NO: C222 Protein Sequence
Protein Accession #: VP_0<13237.1



MGI.AHGLGVZ> 
ZSDAMItXPPV 
SNGKAGTLDL 
VPIOSVFTHD 
TLDNNWNGS 
VTEEtfKBtiAN 
ATVPDGHCCP 
RTCHIQECDK 
TKACKKDACF 
CSnCQDCPIDG 
NHNGEHRCEN 
AKCHYIiQHYS 

H13PDQADTDM 
NPDQIiDSDSD 
DQIPDDKDlircr 
QM3PU3FKGT 
DYAOFVFGYQ 
HHTGKTPGQV 



11 
1 

FIiMHVOGTHR 
PDDKPQDLVD 
SliTVQQKQHV 
tiASIARIiRXA 
SPAZRTNYIG 
ELRRPPIiCYH 
RCHP9D&A13D 
RFXQDGGNSH 
XHOGWOPHSP 
CLSNPCFAGV 
TDPETmOjEC 
DPMYRCECKP 
KDGIGDACDD 
NGE6DACAAD 

RLVPNPDQKD 
SQNDPNWWR 
S6&RFYVVMN 
RTLWHDPEiai 
FVFGQEHV^ 



21 
I 

IPE8GGDH3V 
AVRAEKQEliL 
VSVEEAIiXAT 
KGOVNDNFQG 
HKTKDLQAIC 
WGVQYHMNEE 
GWSPWSEHTS 

nspnascsvir 

WDICSVTCQG 
iQCTSYPDGSW 
PPRFTGSQPF 
OVAGMQIIGG 
DEONDKIPDD 
XDODaxXiNBR 
DIDEOCaiQHN 
SDGDGRQDAC 
HQGKSLVQTV 
KjQfVTQSYHDT 
(SMXDFTAYRW 
SIXLSXEICRDP 



31 
! 

FDXFBLTGAA 
lASLRQMKKT 
GQWKSXTIjFV 
VLQNVRPVPO 
GXSCDEIiSSM 
MTVDSCTHCB 
CSTSCGUJGIQ 
OSDGVITRIR 
GVQKESItLCN 
KOSACPPGYS 
G^yiVEHATAN 
EDTHLDGWPN 



41 

I 

RKGSGRRLVK 
RGTLULLBRK 
QBDRAQDYIS 



51 
I 

gpppsspafr 
DHSGQVFSW 



CNCQfyVYNVD 
LDNCPYVPHA 
KDDFDRDSVP 
NCDPaiAVGY 
NPTRAQ6YSG 
RLSERPKTGF 



VLELRGliRTI 
CQNevTlCKK 
QRGRSCD3IJI 
LCHSPSP(3»IH 
S9PAPQFGGXD 
GNGIQCTDVD 
KQVCKPRNPC 
ENXiVCVANAT 
AiQYDYDRDDV 
QSDTDHDGVe 
NQADHDKDGK 
DIPDXCPBKV 
DEFNA VDFSG 
XjSVKWEISTT 
IRWMStB^CK 



GCS8STSW« 
VTTLQDdXRK 
VSCPIMPCSH 

GKFCB6BARB 

cvanvTEnQx 

BCKEVPDACF 
TD<3THDC2iIKN 
YHCKKDHCPS 
ODRCDHCFYN 
X3QCDHCFLEB 
QDAGDHDDDH 
DISETDmSF 
TPPINTEHDD 
GPGEHLKMAIi 



30O 
36D 
420 
469 



60 

120 

180 

240 

300 

360 

420 

4ao 

501 



60 

120 

180 

240 

30O 

360 

420 

480 

540 

€00 

660 

72Q 

780 

840 

900 

9eo 

1020 

loeo 

1140 
1170 



Seq ID NO: C223 Protein Sequence
Protein Accession #: NP_002183.1



1 
I 

KPLDniiRGFIi 
HlXjNMLHZtKK 



BPQGSU>T6B 
RXACBQCQES 
EDHPHRRRHR 



11 21 31 41 

lilt 
liASCttXXVRS SPTPGSEGBS AAPUCPSCAI. AAX>PKDVPHS 
5PDVTQFVPK AAUiBIAIB39QCi EVGKVGEHGY VEXEDDlGSR 
TARKTXAFKI &KEGBDLSW BRABVWZfLK VPKU^IRTRTK 
BAEevaunGB RSELX>LSEinr vdkrkstwbv ffvsssiqri^ 
GASLVItXiGKK KKKEBBGB9K KIQQGGBOGAG ASEEKEQSHR 
GLECDGKVNI CCKKQErVfiF KDIGWNDWII APSOXHAHYC 
ajVIBBYRMRG HSPFANIiKSC CVPTKUtPHfi KLYYBDGQKX 



51 
I 

QPEHVSAVKK 
ABMfilELMBQT 
VTXRLFQQQK 



PFLMIiQARQS 
SGBCP&HIAG 
XKKDXQNHlir 



Seq ID NO: C224 Protein Sequence
Protein Accession #: KPjDOOQBfi.l



1 
I 

NSVPDTACVLL 
FI»KNTV|aBCD 



ADCVIiERDGS 
EDVDKDGIGD 
DTZX)DGRGDA 
VDHDFVGDAC 



PBQiAQIDPN 
FGYQDSSSFY 
BSQVJUjLWKD 
RZiOVFCFSQE 



11 

i 

IiTLAAI^GASG 
ACX3MQQSVRT 
OSAHPCFPRV 

SSCVCKVGHA 
ACDPDADGDG 
CDDDlDGDRl 
D8PQDQDGDQ 
OQEDADRDGV 
WWtHQGREJ 
WI4WE0MEQT 
PRNVGWKDKK 
NXZKAHIjRYR 



21 
I 

QGQSPIiGSDL 
GLPSVRFLLH 
RCXHTSPGFR 
SeSFQCSGPCO 

otgilogrdt 
vp'kekdjsicpl 
bnqadhcprv 

HQDSBDErCPT 
ODVCQPDFOA 
VQTMtfSDPGL 
YWQANFFRAV 
SYRWFLQHRP 
CHDTXPEDYE 



31 
1 

GPQMIiRELQB 
CAPOPCFPGV 
CBACPP6YSG 
FOFVODQAGG 
DCtXSFSDEKb 
VRKPDQKNTD 
FHSDQKDSDG 
VPNSAQEDSD 
DXWDKIDVC 
AVGXTAFHQV 
AEPGXQLKAV 
QVGYIRVRFY 
THQIRQA 



41 
I 

TNAAZ42t>V»D 
ACIQrrBSGGR 
FXHQGVaLAF 

RCFEFQCSIKD 
EDKMGDACDlf 
DGI<S3ACnHC 
HDGQGDACDD 
PENAESVTUID 
XIFfiGTFHVIST 



51 

I 

KLRQQVSEIT 
C0PCPAOFTG 
AKANIDQIVCTP 



SGPELVADSH 



HCVTVPSI6GQ 
GRSQKUDDf^ 
FQK&NPDQAD 
DDDMDGVED3 
FHAFQTWLD 
VTilUUVAQFI 

WIiDTTHBOQ 



Seq ID NO: C225 Protein Sequence
Protein Accession #: NP_612464



11 
I 



21 

I 



41 



60 

3.20 

180 

240 

300 

360 

420 

426 



60 

120 

180 

248 

300 

360 

420 

4B0 

540 

fiOO 

660 

720 

757 
MRPQGFAA3P QRJ^RGLIiUJi T»LQUPAPSSA SEIPKGKQKA QLHQREWDL YNGMCLQGPA 60

GVPGSDGSPS AWOZPGTPGl F6RDGFKQEK GEdiRBSFEE SHTPNYIOQCS WSBLHYGIDIi 120 

GKXABCTPTK MRSNSKLKVI. FSGSLRUCCH. NACCQHWyPT FNGAECSGPL PIEAIIYIJ3Q IBO 

THRTSSV£6I* CBGIGAOZiYD VAIWV13TC9D YFKiauiSTGM NSV6RIIIEB 240 

5 LFK 243 

Seq ID NO: C226 Protein Sequence
Protein Accession #: HF_003216.1

1 11 21 31 41 51

I t I I i 1 

HATHENKVIC AI.VLVSHLAI1 GTLABAQTET CTVAPRERQK OQFPGVTPSQ CANKGCCFDD 60 

TVRGVPWCFY PNTIDVPPBB ECBF 84 

Seq ID NO: C227 Protein Sequence
Protein Accession #: KP_0£6234.1

1 11 21 31 41 51

OA I ^ ^ ^ ' ' 

aX) HPKBAHHGAI* SWXrIIJ«UGH PRVALACPHP CftCYVPSEVH CTFRSIiASVP AQIAIlHVERI 60 

HliOENSIQUi S&lSFASLrK LEIiLHIHGNB IPSJPDGALR I3LSSLQVFKF SYMKLSVITQ 120 

QTLQC3I»SRtiK RliHTEHNKIB FrHPOAFlflGb TSLRLT^HIiBG miXJiaijEPST FSTFTFLDYF 160 

AIiSTiaHLYL ABNMVRTLPA SKIiRHHPIiLB KZiYIiQCHPHT CDCEMRWPtiB mXAKSBGXLIC 240 

^ CKKDICAYEOG QLCAMCPSPK KLYKHEIHKL KTSTTCUCPai ESPLHQNReR SIEEBQBQEE 30D 

25 SC3GSQ£>ILEK FQLPQKSXSIj NMTDEHONmr VIiVCDIKKPK DVYKIHIiNQT l)FSDlDI»Alr 360 

VAIiDFBCPHT RENYEKLHKL lAYYSEVPVK IiHUELHLSKD ERVSYQYRQD ADEBAJiinfXG 420 

VRAQIIAEPE HVMQPSIDIQ I)HECRQST»KK VLLSYYTOYS QTISTKDTRQ ABGRSHVMIS 400 

PSGKVQBDQT VLIIOGPOCLS CNV^BEISFS IFWVIiPDGSI LKAPMDDPDS KFSII>S8aHI< 540 

RIK8MEP8I>S GIiTQCIAOVR DEMDHMVYRV LVQSPSTOPA EKDTVTJGKH PGESVTLPCK SOO 

30 AlkAlFSAHLS WXLStlBRIIN DIANTSHVYM LPNGTL&IPK VQVSDSGXTS CVAVNQQOAI) 660 

HFTVGTTVTK KG9GI*P9KRlQ BRfiGAiQUjSR VRKDIVEDBG QSGHGDEEEH: BRRLLHPKDO 720 

EVFXiKTEaSDA l^tGDKKAKKS RRKUOAIKHa BKEPSIMVAE 6RRVFBSRSR IHMANKQIHP 7B0 

ERNADIIiAKV RC3iCI3ItFR(3T£ VPPIjIKTTSP P£IZ>8L5VTPP FPAVSPP^B FVQfTVT&AEB 840 

SSAEVPLXiQB EEHVLGTIS6 ASHGLEHNHN GVILVBPSVT STPLEEWDD I^SEKTEEITS 900 

35 TEGDLKGTAA PTIiXSEPyEP SPTLHTLDTV YBKPTHBBTA TEQH5AADVG &fiPEPTGSEY 960 

BFPUJAV5EA BSEPMQYFDP DLETKSQPDB BKNKSDTFAH LTPTPTIHVN DSSTSQI^FSD 1020 

STI6EFGVPG Q9EIiQGI>TDKI ZHI>VXSGLeT QDTLLIKKGH KEMSQTIiQSG NHliEQDPTHS 1080 

liSSBSEGQBS KSXTLPDBTL GIMSSMSFVK KPAETTVGiri* liDKDTTTVTT TPRQZVAPSS 1140 

. TMSTHPSHEtR PISGRRSIiRPN KFBHBHKOl^P PTTFAPSSTF STQPTQAPDl KI&GQfVB88L 1200 

40 VPTAHVDHTV SlTFKQLBMBK NAEPTSKCTT? nsiOiaKSPKK HRYT?SWSS RASGSKPSPS 12£0 

FSEIKHSHIVT PSSETIIiTiPR TVflliKTBePY DfiLDYMTTTR KIYgSYPKWQ ET1*PVTYKPT 1320 

SDGKBZXDDV AINVDHHIKSD IIArTGBSITN AXFTSRSLVS TMGEFKEESS PVGFPCTTPTH 13 BO 

NPSRTAQPGR LQTHIPVTTS GENIfTDPPLL KSLEDVDFTS BPLSSimrST PFHQBEAGS& 1440 

TTLSSIKVEV ASSQABrTTl. DQpHLBTTVA lIiIjgBTEPQN HTPTAAKMKS PASSSPSTIL 1500 

45 MSLGQITTXIK PALP8PRISQ ASRPSKENVF Il^IYVCailPETB ATFVSSKEGTQ HMSQPHELST 1560 

FS9DRZ3AFIIIt STKI^EKQV FGSRSLPRGP DSQR0DC3RVH ASHQLTKVPA KPILPTATVR 1^20 

ErPSMSTQ&A8 RYFVTSQSPR HHTNKPBITT YPSOALPEKK QFTTPRIiSST TIPI.PU2HSIC 1660 

PS1F9KFTDR RTDQEllGY&K VPGUIINZPEA. ENFVgXPPSP RIPHYSNtSlL FFFTNKTLSF 1740 

PQLGVTRRPQ IPTSPAPVKR ERKVIPQSYN RIHSHSTFHL DFGPPAPPI.1. HTPgTTaSPS 1800 

50 TNIiQIIlPKVS STQaSIGFXT 8SVQ8SG&FH QSSSKFFA0G PPASKFffBLG EKPQII»TJC3P ISfiO 

QT7SVTAETD TVPPCEATQK PKPFVTWTKV STGAU4TPNT RIQRFEVIJai GTIjVXRKVQV 1920 

ODRGQYHCIA SNLaCLDKHV VIiLSVTVQQP QXLASHrQDV "PVYIiGiyriAM ECLAKGTPAP 1930 

QISHIFFDKR VHQTV9FVES KTTLUEHRTL 8ISCEASFSDR OVYKCVA^A AGADSLAIRL 2040 

HVAAIiPPVIH QEKLEHISLP P6LSIHXHCT AKAAEIiPSVR KVLGDGTQIR PSOFISOHIiF 2100 

55 VFPHOTIiYIR NIAFKESGRY BCVAAlSIiVGS ARRTVQLNVQ RAAANARZTG TSPRRTOVRY 2160 

GSTLKLDCCA GGDPWFRZX.H RLP5KRHIDA liFSFDSRIIW FANGTLVVKS VTPXDAIXTYli 2220 

CVAHNKVCSESD YWLKVDWM KPAKXSHKE& NDHKVFY66D UCVDCVATGL PNPBISWSLP 2260 

DG8LVHSFKQ SDD5GGRTKR YWFNNCTItY FHEVOKRBEG DYTCFAENQV 6KDEMRVRVK 2340 

WTAPATXRN RTYIiAVaVPY QDWrVftCBA KGEPHPJWTW IiSPTSKVlPT S8EKY1QIYQD 2400 

60 GTI^IQKAQR SDSGNYTCIiV RH9AGES3EICr WXHVKVQPP KINGHPHPXT TVSEXAASGS 2460 

3IXI.IDCKAEG IPTPHVUIAF PBGWI^PAPY YC^XTVEQfiT QfiUJlRSLRK SDSVQIiVCMA 2520 

SKEGGraOUJI VQLTVLBPMB XFXFHDFXSB KITAKAGHTI SLE3C3AAGTP TTPSLVKVLPH 2560 

GTQLQ&GQQL QREYEKADGH I^ISGLSS^m AQAYKCVARN AAGHTERLVB LKVGUKPEAK 2fi40 

KQYHdliVSXX NGETLEIiPCT PPGAGQ<%FS WTLPNGHHIiB GFQrLGRVSL USilGTUlfVRe 2700 

65 ASVPCRGTYV GSHBTETOPS VTBIPVIVXA YPPHItSEPT PVIYTOEGJ9T VBIiHOttMOI 2760 

PKADXTWELiP DKSHI.1CAQVQ ARIiYGNRFLH PQGSLTXOHA TQRDASFYKC MAKHILGSD8 2B20 

KTTYIBVF 2826 

Seq ID NO: C228 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I 1 I I I I 

MPGTKLTRTG APADYIEVII»K TSOEDHLDVP DDX&VRVM8S QSVXiVSHVDP VIiEKQKKWA 60 

SRQYTVRYRE KGEUURWDYK QlANRHVLIK JILrPDTVYEF AVRIGQGERD GKNST8VFQR 120 

TPESAPTTAP ENLNVWPVNG KPTWAASWD AIiPETEGKVK VCXxLOTGLPa V3SFQPSAKS 180 
FOHTFFHTPR IiSUHLEQSPS PXLGTIfliliPW WMVCSJCGNAX FSK8GPQTGB AWDLTPKPSI- 240 
SLCQQECSCr QKDP8CLAYI. XPXQTKQVNK DPQILBGSVFG PCFXiFYFIiTF KU7IGGF&FI 300 
- V4CYEBP?VSS liTGHSLKSVA A)6EADVQaHT EDHGDKPEKPB P8&P8PRAEA SSOBPSVPAS 360 
oU PQGRSIAKDLIk LDLKNKXLAK GGAPRKPQLR AKKABBIJ>LQ STEXTGBBBI* G8RED8SH8P 420 
SDTQDQKRTL, RPF5RHGHSV VAPQRTAVRA IU4PAIiFBREG VDKPGFSLAT QPRPQAPPSA 480 
SA8PAHKA8T QGTSERPSI#P ASUUDtSDhVD SDEDERAVG8 LHPKGAFAQF KPALSPSRQ8 540 
PSSVIiRDRSS VHFOAKPAfiP ARRTPKSGAA. BEXISSASAPP SEtliSPPHOGS SRLLPTQPEZi GOO 
SSPXtSKOOKD GBDAPATEVaK APSRSTBISSS VSSHLSSRTQ VSBOASASDG E8HGD^3RBD 660 
GQRQABIVTAIQ TEiRASPASGH FHEjLRHKPFA JUaGASPSRFS T6RGPRLQPS SSFQSTVFSR 720 

AHPAVPSHSP SHPKIiSSfilB ODBEEDBKFLP ATWNDHVPS SSRQPISRGW EDIjaRSFQRiQ TBO 

ASEiHRKEPrP EIJPKSTI3KDT EPQGKYS9LA SKAQDVOQST DADTEGH6PK AQPGSTDRHA 840 

SPARPPATO^S QQHPSVPRRM TP&HAPBQQP PPPVATSQHH PGPQSRDJWSR SPSQPRLSLT 90Q 

QAeHPRPTSQ GRSHSSSDPy TASSRGMItPT ALQNQDEDAQ GSYDDDSTEV EAgDVRAPAH 960 

AABAKBAAAS &PKHQQVBSP IGACAGGDHR 5QRGHAASPA RPSRPQOPQS RARVP9RAAP 1020 

GKSEPPSKRP LSSKSQQSVS AEDEEHSDAlS FFKOOKEDIiI. SSSVPBCNPSS STPSGGBODAD 10 BO 

GSEtAKEBREP A7AUVPRI30S liAPVKRPIiPP PPG9SPRASH VPSRPPPRSA ATVSFVAGTH 1140 

PMPRYTTRAP PGHPSnWDL. SLRQRMMHAR FRWPIxSRQPA RPSYEQGYNG RPNVBGKVLP IZOO 

10 6SEIGKPHOQH IINQPQQTKH WDEJJRGIjVL NAEGRYLQDS HGNPLRIKLG <3DGRTXVBL& 1260 

CSTFWgpOGi:* PIjIiGQSRHGT PLANAGHIKPZ ItSLGGKPLVG LEVHOCTTHP PTTTMQPTTT 1320 

TTPliWTTTP RPTTATTMQP TTTTTPI^TT TPRPTTATTR HTTTKRPlTr VMTTRTTTT 13 BO 

TTPKPTTPrP TCPPGTLEHH DODGNLIKS? NQIPECVAEE DEFSQLETDT AVPTEEAYVI 1440 

YDEDYfiPETS RPPTTTEPST TATTPHVIPE EGAISSPPBE EPDLAGRKRP VAPYVTYUTK ISOO 

DPSAPCSLTD AZJ3HFQVDSL DBZIPKIDUKK SDLPPQHAPR NXTWAVEGC HSFVIVDWDK 1560 

ATPGDbVTGX LVYSASYEDF IRl»CFSTQftS SVTaitPZENIi ICPimtYYFKV QAQHPHGVaP lfi20 

ZSPSVSFVTS SnEXPLIfWRP PGGHLS95HS LSHMIPATRT AKDONM iefi6 






Seq ID NO: C229 Protein Sequence 
Protein Accession #: NP_003005.1 

1 11 21 31 41 51 

| | | | | | 

MFLSILVALC LWLHl*AU3VR GAPCEAVRIP MCHHMPWNIT RMPNHMHBT QKHAIIAIBQ 60 

YESLVDVNCS AVLRFFFCAH YAPXCTItEPIt HDPIKPCKSV CQRARDlXirEP LMKMYHHSWP 120 

ESLftCDSIfFV -TORBVCZSPB A1VT0UPECV KHIDITSIIMM VQERPIfWDC KRZiSPZSRCKC 180 

KECVKPTIATY LSKNYSYVZH AKIKKVQRfiQ CHBVTTWDV KBZFKS8SPI PRTOVPLXIIH 240 

SSOQCPHIIiP BQDVZiIMCyB HRSBHMIiLEir CLVEKHRDQL 8KRSXQNEER LQBQRRXVQD 300 

KKKTAGRTea SEIFPKPKGKP PAPKSASPKK NIKTRSAQICR TSPKBV 346 



Seq ID NO: C230 Protein Sequence 
Protein Accession #: NP_005931.1 

1 11 21 31 41 51 

| | | | | | 

HftPIAAtlLRBft AARALLPFMI* UrLLOPPPLL KRKUPVPVmi LHAHRRlGPQP miAAIiPSSPA 60 

PAPATQBAPR PASSLRPPRC GVPDPSDGX^S ARETSQKHFVL SGGRHElCim TlCRZLRPPWQ 120 

I>VQEQrURQfXM AEAUCVWSDV TFIOTTBVHE GRADXHXQPA RYHHGDDLFF DGPGGIMHA lao 

FPFKTHREGD VHFDVDBmT IGDDQSrXIIili QVAAHEfGHV LGLOHTTAAK AIjMSAFy*I?R 240 

yPIiSli6P!DDC RGVQHLyOQF WTVT&BTPA LGPOAGICTH EIAPItEPDAS PDACBASFDA 300 

VSTIRGEZ^F FKASFViraZA GGQIi(^?6!n?A £iftSBBHQ0UP SEVDAAFEDA QGHZHFFQGA 360 

QTOmDGSSSB VL6PAPLTBL CUVHFPVHAA LVH6PEKKKX YPFRGRDYHR FHPSTRRVDS 42D 

PVPRRATZSnR 0VPSBIE2AAF ODADSXAyFIi BGRIiiniKSDP VKVKAIiEQFP SZiVOFDFfGC 400 

AEPANTFZr 4B8 



Seq ID NO: C231 Protein Sequence 
Protein Accession #: NP_07692? 

1 11 21 31 41 51 

| | | | | | 

KGBNDPPAVB APFSFSSIiFG LDDLSCZEFVA PDADAVAAQI LSLLPIiTPTP IXVlGllALl 60 

lAUUlGLGIH FDCSGinrRCR SSFKClSLIA RCDGVSDCKD GHDEZRCVRV GGQKIAVX>QVF 120 

TAASWKXWCS DDWKGHYAMV AGAQLGFPSY VSSDHIiSVSS LEGQPfflSBPV aiUHLLPDDK 160 

VTALBHSVYV REC3CASGHW TLQCTACSGHR RGYSSRXVGG MHSljZiSQHFtt QASLQPQCnH 240 

LO(9(9SVTTPL HIIXAASCVY DLYLPKSHTX QVGLVSLliDK PAPSHLVEKX VYHSKYKPKR 300 

TJamiiOMKb AfiPIiTEHBHX QFVCLmB&B HFEDGKVC^ SGiffGATEDGG DASPVUIEAA 360 

VPIflSHKZCH HRDWGGXXS PSHLCAGXCT OOTDSOQGDS OGPLVCQERB ZMXX>VSATSF 420 

OlGCSffiVMKP GWTRVT5PX> I3NIBEQ94BRD IrKT . 453 



Seq ID NO: C232 Protein Sequence 
Protein Accession #: NP_003211 

1 11 21 31 41 51 

| | | | | | 

MLHKLTDNIK YEDCBORBDO TSNOTARliPQ USrCVQQSPYT fihPPL&HTPN ADFQPPTPPP 60 

PyQP3YP09Q UBYSBfnSDVY SLNPZOAgPQ PQH^GKPGQR QSQESQI-IAT BRGLPHQLSG 120 

LDPRRDYSRH EDLLHGPHAI^ S5GU3DLSIH SLPHAIEEVP HVEDPGINIP DQTVXKKQPV ISO 

^SKSEHSNAV SAtPINKDNL FGGWXIPNBV FCSVPGRIfSL LSSTSTYKVT VKEV0RRL8P 240 

P&CUIASI»U3 GVLIOIAXSKH GGRSI«RHKIJ3 IQGUn^AGR RKAAHVTIiLT SLVEGSAVBL 300 

ABDFGYVCBT BFPAKAVftBP UntQHSOPNB QVXSKNMIiIia XXQICKBPTD L&AQDRSPLO 360 

2iI&RPBPII.EP GlQSdJCBBN LlBHBFaSFA VCAKVTALQH XUTBKUXJiMD XKnjBmSlif8 420 

BTDWHRKSSD KEEKHRK 437 



Seq ID NO: C233 Protein Sequence 
Protein Accession #: NP_002979.1 

1 11 21 31 41 51 

| | | | | | 

MKStAAAIkLV LVCTMMiCSC ftQTOTNKBLC CLVYTSWQIP QKFIVDYBET SPQCPKPGVI 60 
LLTKEGRQIC ADPKKKWVCJK YieDLKMIA 89 

Seq ID NO: C234 Protein Sequence 
Protein Accession #: NP_004054.1 

1 11 21 31 41 51 

| | | | | | 

HIIdQAHI>HSli CUiT^YLATG yGQBGKFSGP IiKPMTFSIYE GQEPSQIIFQ FKKMPPAVTP 60 

ELTGETDHIF VIERBQIiXaYY NRALDfiCTRS THNLQVAAIJ) AKGIIVB&PV PITIBVKDIH 120 

DtmPTFLQSK YEGSVROHSR TGKPFLYVSA rTDLDDPATfW CSQbYYQTVXQ UHINMVHYP LBO 

QIHNKTGAZ5 tiTSEGSQEUT FAHNPSYNIaV ISVKDMC9GQS BTSFSDTTSV DIIVTENIUK 240 

APKPVEHVEN STDPHPIKIT QVRWNDPQAQ YSLVDKEKLP RFPFSIPQBG DiyVTQPIiDR 300 

EEKDAYVPyA VAKDEyQKPL SYPIiSIKVKV KDniDNPPTC P8PVTVFEVQ ENKKLGfiieiG 360 

TLTAHDRQEIE NTANSFE^OYR IVBQTPKZ.FM DGLPLIQTYA GHLQIiAKQSL KKQDTFQyNI* 420 

TIEV5DEDFK tTLCFVglllVl DZMDQIPIFB KSDYGNUTLA EDTNIGSTXli TIQATmADEP 480 

FTGSSKILYH IIKGDSBGEO. GVDTDPimiT GYVIIKKPLD FBTAAVfiNXV FK?^EBIPEPI*V 540 

FGVKYNASSF AKPTLrVTDV NBAPQFSQHV PQAKVSEDVA IGTKVOflVTA KDPEGIHiay 600 

SEJiQDTRJiSWL KTDHVTGEIF SVAPIiDREAG SPYRVQWAT EVCSOSSLSSV SEFHLILMDV 660 

NCNPPaXAKD YTQIiPFCHPIi EAPG&LIFEA TDDDQHLFRG PHFTF&LGSG SLQNDHEITSK 720 

IHdTHARIiST RHTEFEEREy VVI1IRIIIDG& APPIiEGrVSI. PVTPCSCVE6 6CFSPAGHQT 7B0 

6IFTVGMAVO lU^TIAJVlG IILAWFIRI KKDR0KDMFVE SAQASEVKPI. RS 832 

Seq ID NO: C235 Protein Sequence 
Protein Accession #: NP_004434.1 

1 11 21 31 41 51 

| | | | | | 

MARAHPPPPP SPPPGLUI^Ii PPI.TiT.TiPLLL LPAGCRAIOSE TLHDTKWVTS ELAWTSHPBS 60 

GHEEVSGYDB AMNPXRrYQV QIVRES5Q»N HLRTQFZHBR DVQRVYVEI«K FTVESDCSISIP 120 

HXFGSCKETF I9LFyWU39D VA&ASSPEWt EKPYVKVDTI APDE5FSRLD AGRVNTKVBS IBO 

FGPLSiOSFy IiAFQDQGACM SLISVRAFYK KCASTTAGFA LFPBTLTQAE STSLVIAFG^ 240 

CZPIIAVBV6V PI)ia>YC6K^Q EWKVPVGACT CATGHEPAAK SSQCRPCPPG SYKAKQGBOP 300 

CLPCPPNSRT TSPAABICTC HNNFYRADSD SAtoSACTTVP SPPRGVISHV NETSIiILEMS 360 

BPRDLGGRDD LI.YHVICKKC HGAOGASACS RC33EKVEFVP RQL6LTERRV HISHUiABTR 420 

YTFEVQAV19G VS^CSPLPPR YAAVNXTIHQ AAP6EVPTLR XiH93SGSaZ*T LSHAPPERFN 480 

GVILDYEMKY FEKSEQIAST VtSQWSISVQl^ IX^RFDARYV VQVRARTVA6 YGOYSSPAEF 540 

ETTSERGaGA QQiLOEQ!:.PLI VGSATAGLVP WAVWIATV OJIICQHHGSD SBTCEKI^Y 600 

lAPGMKVYID PFTYEDPHEA VHSFAKEIDV SCVKIBEVIQ ASEFGEVCRG RLKQPGRREV €60 

FVAIXTLKVG YTBRQE£lDFL SBASIHOQFD SPUIIRLEGV VTKSRPWaii TEFKBSIICMJ3 720 

SFLRLNDGQF TVIQEiVQH&R GIAAGMKYLS BWYVHRDIA AlDlILVHeHL VCKVSnFOXiS 7 BO 

PSrCBODPaDP TYTBGLGGKI PIRWTAPEAI AYRKFTSASD TR9VOIVHHE VMSMSBRPYH 840 

Z3HENQDVINA VEQDYRIiFPP HDCPTALHQL MXiIX:hviidrn LRPKFSOIVH TLDKitXRNAA 900 

SLKVrASAQS GHSQPIiLDRT VPDYTTFTTV (XmU>AIKMS RyKBSFVSAG FftSFmiVAQM 960 

TAEOUjRIGV TXAGHQKK1I< gSIQDMRLQM NQTUPVQIT 998 

Seq ID NO: C236 Protein Sequence 
Protein Accession #: NP_001795.1 

1 11 21 31 41 51 

| | | | | | 

HYVQYVISKD SFVXP6PARP ASLGLGPANY QPPAFPPAFP OYPZyPSSYSK VBPAPAPFTA 60 

WGAPFPAPKD DttAAAYGPGP AAPAASPASIi AFGPPPDFSF VPAPPdPePG LLAQPLGGPG 130 

TPSSPGAQRP TPYEHMRHGV AAGGGGGSGK TRTKSKYRW yTDEQSI»H[jE KEFHYSRYIT 160 

50 IltRKSEU^ IdGLTERQ\na iCFQNRRAlCSR KVSKKKQQQQ QPFQPPMAHD ITATPAGPSL 240 

QGLCPSNTSZj IATS6PMPVK EBFLP 265 

Seq ID NO: C237 Protein Sequence 
Protein Accession #: NP_068813.1 

1 11 21 31 41 51 

| | | | | | 

MG^SDRARKESG GGPKDFG2U3D KYNSRHEBVN OlaEEBVBFDP VHNVKKVEKH GPGRITWIiAA 60 

- VJjTGI^VMj SIGFLVWHLQ YRDVRVQKVF NGYMRITNEH FVDAYEiraNS TEFVSLASKV 120 

QO KDAIiKI.I>ySG VPF1.GFYHKB SAVTAFSEQS VIAYYWSBFS IPQHLVBBAE BVKAEESWH 180 

1>PPAARSLKS FWTSWAFP XDBKTVORTQ DKSGSFGUIA RGVBUiQRFiT PG9TDSFYPA 240 

HKROC3«ALRa X»I>SVL6LTF RSFE^J^CSB ROfiDLVTVYN TLSEHEPEAI. \rQLOtSTYPP& 300 

YNLTFHSSQKr VIiIiXTItXTSIT BRRHPGFEAT FFQI^BMSSC OGRlAKAQGT FSISPYVF^^ 360 

PEHIDCTWMI EVPNNQHVKV RFKPFYM.BP GVPAGTCPKD YVEXNGEKTC GBEifiQFWTa 420 

HSNKITVRFH SDQSYTDT6F LAEYLSYPSS UPCPGQFTCR XGRCIRKBLR CDGmmClXS 480 

. SDECnCfiCDA GHQFTCKiaKF CKP UWCPS VESDOQDSSSDB QQCSCPAQTF RCSMGKCL^ 540 

SQQCNGKEfDC QDGSDEASCP KVNWTCTKH TYROiHQIjCL SKGSilPECDGK BDCSD09DEK 600 

tlC33GGLR8FT RQARWGGTO ADBOBRPHaV fiX^SOiGOGHI 0GA9I>I3PJ9W IiVSAAHCYID 660 

DRGF31Y3DPT QHTAFLGLHD Q&QR8APGVQ ERRI^ICRXXSH FFFNDFTFDY DIALIJSKiKKP 720 

AEY8SHVSPX CLPDASEVFP AQKAIWVTGH WSQXQOXGR. LILQKQSZitV ZUQTSXWhb 780 

PQQrrPRKMC VOFLSaaVDS CQGDaC»3PX>S SVBADGRIFg A9WSm3DGC AQRjKKPGVXT 840 

XtKPDFSDKIK BNTOT 855 

Seq ID NO: C238 Protein Sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MPPFLUiHAV CVPLF3RVPP SLPLQEVHVS KETIGKISAA SX^fMHC:aAAV OIMFI>I^G8N 60 

SO SVGKGfiFBRE XHFAITVCDG U>XSPERVRV GAFQFSSTPH LEFPLD8FST QaEVKARIKR 120 

HVFKGGBTfir ELAIiKYZ^Lnn. GLPGGSHASV PQIUIVTOa KSQOIVAI^S KQErKBROVTV IBD 

FAVGVRPEBN EBLUALAfiSP RGQpVIiLlkEQ VBDAlHSItFS niSSSAXCSS ATFDCRVBRH 240 

PCEHRTZ^EHV RBFAiQHAPCH RGSRSTLAVL AAHCFFYSHK KVFL1HPATC YRTTCPGPCD 300 

80PCQN6GTC VEBGLDGYQC ZXIPIiAFGGEA HCAI>KI>SI*EC RVDIiI.EU^8 SMSITLDGFIi 360 
HAKVFVKRPV HAVLSBDSRA HVGVATYSRE IxLVAVPVQEY QDVPDLVWSL DGIPFHGGPT 420 

liTOSAIiRQAA CRGFQSATRT GQDRPRHVW IiLTESHSEDE VAGPAEIHARA PFIliTJ JiGVUS 480 

EAVRABLEEI TQSPKHVKVY SDPQDLPNQI PELQGKLCSR QRPGCRTQAi:* ni,VPMU>rSA S40 

aVSP^TFAQH OSFVR£CAIiQ FEVIyTPDVTQV GLWYGSOVQ TAFG£J3IKFT SAAMLRAISQ 600 

KEYU3CSVGSA GTALLKXYDK VMTVQRQARP GVPK2^WVLT GGRGABDAAV PAQKUSHHGI 660 

SVIiWSVQFV I*SBSLRRIiA& PRDSIjIKVAA YADLRYHQDV LXEHLOGEAK RPVNLCKPSP 720 

CHElEaSCVLQ HtSSmOOCKD G HEGPHC EMR FEJtSP 755 

Seq ID NO: C239 Protein Sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MPPFLIjI>EAV CVFLF9RVPP SLPLQBVHVG' KETIGKISAA &KMMWC8AAV DIHFLLD6SN 60 

SVGRGSPERS KHFAITVCDG LDISPEBCVRV GAFQPS3TPH LEEPLDSFST QQBVKARIKH 120 

MVFKGGRTET ELAIJCYLIiHR GLFGGRNASV PQILIIVTD6 KSQGDVALPS KQLKERGVTrV 130 

PAVGVRFPRff EELUALASSP ROQHVLIiAEQ VSDATNGLF9 TL35SAIC5S ATPDCRS^flAH 240 

PCEHRTLHW REFACTIAFCK RSSHRTUWl* AAHCEPVSHK RVPLTEPATC TCRTTCP6PCD 300 

^ eQPCQHOGTC VPBGLDGYQC LCPLRFGGEA NCAWCLSIiEC KUDLLPLLDS SAI3TTLD6FL 360 

RAKVFVKRFV RAVItSKDSRA RVGVATYSRE IiIjVAVFVGEY QDVPDIjVWSIi DGIPFRGGPT 420 

LTG8ALSQAA ERGFGSATRT QQDRPRRVW IrLTSSESEDB VAGPASEARA RELlHilAJVQS 4Q0 

BAVHAELEEI TGSPKKVMVY SDPQDIiFNQI FELQSKLC6R QRPGCRTQAL DLVFMI1DT8A 540 

6VGPBNPA<JM QSFVR8CALQ FEVNPDVTQV GJ.WTO9QVQ TAFGIOTKPT RAAHLRAISQ 600 

APyiiGQVGaA GTAIimiYDK VHTVQRGARP GVPKAWVLT GGRGAEDAAV PAQKDRZmGI 660 

6VI1WGVGFV LSBGLRSIiAG PRDSI<IHVAA yADIAYBaDV IflEHIiOG&AK QFVl)IiCeCP6P 720 

Ca4HBGSCVI^ KSSIOLCSCXD GMECSPHCBISR ENSSCSVCV8 QGNJXiBTPIiR HHAFVQBC39S 780 

RTPPSNTREG LGTEMVPTFW NVCAP0P 807 

Seq ID NO: C240 Protein Sequence 
Protein Accession #: XP_0973a6.a 

1 11 21 31 41 51 

| | | | | | 

MPKSEPIjGCI. SPAfiHAPGfiA AATXSAWIjPAA SGGPGPLGPP CTCPPESMR ORAQSHAaSS fiO 

P9GCVCVSGI TjRWSVGDPA SSRWVDEASH SEDLSLLLIP HIVGTOGVGG GHARGHVPAQ 120 

BKBVAEGGGH AGSGNGRRLQ RWSARSGflli GRKPCXiQRXiEi PASOGPVQPQ FCFSPATACR IBO 

WGFKFGVAFI7 GRAQHFTZiGR U3GGRAVPSa. TRTLD6P 217 

Seq ID NO: C241 Protein Sequence 
Protein Accession #: CAC03433 

1 11 21 31 41 51 

| | | | | | 

MIjSSTDFTPA SWELWRVDK PISIEBQQKDVT LRVSGDIiHVG GVHI1KI.VBQI NJSQDMSDFA GO 

LWWBQKHCWL LKTKWTLDKY GVQADAKDI*F TPQHKMLRLR I.PNI1KMVRI1R VSP9AWFKA 120 

V^ICKIUII RRSEBIfSZiIiK PSGDYFKKKK. KKDKHNKEPI lEDILNIiESS PTA3GSSVSP IBO 

GWfSKTMTPI TOPINGrPAS STMTWFSDSP IOT:QBICgiIA FSQPPQaPEA URDKCQPRSL 240 

VDKAKLKAGH liDSSRSLMEQ GIQBDEQLLL RFKYYBFFDL UPKHJAVRIN QLYEOARWAX 300 

E^IiEBIDCTEE l^IFAAIiOr HISKI>SL9AE TQDFK3ESBV DEIBAAliSSIIi BVTLEGGKAD 360 

SIiLEDITDXP KLADHIiRIiFR PKKIiXJ?KAFK QYHFIFKDT8 JAYFSmSLE QGEPIiEEI29L 420 

RiGCEWPirVN VAGRKFGXICL LIPVADGHHB MYXiBCDHENQ "XAOHMAACM;!! ASKGKimDS 480 

GYQPSVLniL SPLRMKHBNE A&QVAGSLEH 14DHHPECFVS FRCAKKBKSK QIAARlIiEAH 540 

QUVAQMPLVE AKLRPZQAHQ SLPBPGI/nY Z1VRFKG6KKD DILGV&^OKRL IKIDAAT6IP 600 

VTTWRETKIIC QMNVIMBTSQ WIBEDQNVF TA7TCLSKDC KXVHBYXGGX EFLSTRSXDQ 660 

£1BTU7EDLFH KLTOG^D 677 



Seq ID NO: C242 DNA Sequence 
Nucleic Acid Accession #: NM_005170 
Coding sequence: 337..918 

1 11 21 31 41 51 

| | | | | | 

^ OaaOSTCnOA AAOGGOACaa CXXSGOGOSCSG GAGSUSSGxr ATCIftXACAT TTAAA2UU:CA 60 

65 GGOSOCIGOS OOSGGGCTGC 6®\GAOCTGG GAOAOrCOGG OC3GCAOGCS3C GGGAQUXaMS 120 

caarcccftCGc tcoctqgcgc gtaoggcctg ccaccrctag gcctcctatc ccogggctcc ibo 

AGACGftCCTA GGACGCGTGC CCTOGGGAGr TGCCTGGOGQ CGCCGTGCCA OAAQCCCXCT 240 

TOGQGOQCSCA CAGTTTTCCC CX5TCX3CCTCJC GCTTCXTTCTG CCXGCACCTT CCTGOQQOGC 300 

GCGOGGftCCT 6GAGCGOGCG GGTGGATGCA GOCGOGATGG AOGGOOGCftC ACTGCCCAGG 360 

TO TCCGOGCCCC: CTGOQCCX3CC CGTCCCrGTC GGCTOCGCTQ CCCGGOGGUVS ACCCX30QXGC 420 

OCGGRACTGT TGCGGTGCAG CCX3GCGGCGG CGACX3GGCCA CCGCAGAGAC CGGAGGOSGC 480 

QCAGCXSGCCG TA£3CX3CC3GCG CAATGAGCGC GAGCGCAACC GCGrGAAGCT GGTGAACT3X3 540 

GGCTTCCAGG CGCTGOaGGA GCACGXGCGG CAGGGGGGCG GCAGCMU3AA GCQGAGCAAG 600 

GTGGAG2U3GC TOOGCTCAGC OGTGGAQTAC ATCOSCGCQC TOGSkSCSGCCT GCTGGCCGAG €60 

CAGGACGCCX3 TGCGCAACGC GCTGGGGGGA GGGCTGAGGC GGCAGGOGGT GOSSCGGTCT 720 

GCGOCCGGCX5 GGCCGCCAGG GACCACCCCG GTCCCOGCCT COCCCTCCCG CGCTTCTTCG 780 

TCCCEOGGCC GCGGGGGCAG CTOSGAGCCC GGCTCCCCGC GTTCOSCCTA CTCGTCGGAC 840 

GACAGOGGCT GCGAAGGCQC GCZGAQTCCI GCGGA6GGCX3 AGCTACTOGA CTICTCCAGC 900 

TOSTTAaGOG GCTACXGAGC eCCCTOGACC TA 932 






Seq ID NO: C243 Protein Sequence 
Protein Accession #: NP_060333.1 



21 31 41 51 

MSGGHQLQIiA 
LNIrTLBGVFA 
SIASKARMAG 
KAmTRIELKE 
SQU^TRRYQA 
PHIiHQHRTCP 
P5RSAVARPP 
liQSTSQHPAA 
NCTDISXjQGV 



ALWPWLIiMAT 
GVAEITPAEf3 
ERQASAVI*FD 
PPAWSDYDVW 



LCVFHITBGD 

RPGPFLPSQE 
CPVPLRRiiRP 
HGSSSTFCSS 



SH5AAFSGRL S1IPQCPRKI.P 
85PTPGSRPQ DATWHPACQI 
HSQFVHLCIiT E«QPLBPHPP 



LQAGFGRTGL 
KLKQSKPLYL 
rTEDRAAABQ 
ILMTWGTIF 
DSGSSCSSAP 
&PSQSU3FSR 
PGMGPRHHRF 
FDSSGSGESY 
LSSDPDPIjVY 
RFQHlffiRlCPG 
BPAPOPVDAS 
FFHXTPSVAY 



I 

VIiAAAVE&ER 

LQQPLGliTWP 
VIII»ASVIiRI 

SYQEPGRRLH 
PRAAHPRAPG 
CTERSGYIiRD 
CSPK6DPQRV 
PBTGVPQSEtP 
SiCP&rSSLP 
PHSFBAEPIiT 
TAEGRFCPyp 



1 

SAEQKAVIRV 
PGFISrVKLE 
VVLIWGB9DAE 
RCRPRHSRPD 
EOQBUIVISC 
LIRQHPC3IAH 
EQQRIAGAQH 
GPA6D8SSGP 
DMQP3VTSRP 
FIFRTQPQPE 

HCQVEiSAQPG 



JFIiKMDFTGK 
SPRBAPRPCt 
KLMBEVYKHQ 
PLQQRTAHAI 
XiHZFHRNCVD 
YHIiPAAYItLG 
FYAQGWGMSH 
GHGSSSDSW 
RSIiDfiWPTG 
PPSPDQQVTG 
HFQRKRRGGP 
LPETFGPCYS 



60 

120 

180 

240 

30D 

360 

420 

480 

540 

600 

660 

720 

7B0 

783 



Seq ID NO: C244 DNA Sequence 
Nucleic Acid Accession #: KH_00428d 
Coding sequence: 493..1695 

1 11 21 31 41 51 

| | | | | | 

GC3CGC0GCCT CXHTCCACCQG AGGftflCCGGC GCCAGCGTGG ACOGCGGCAG CCAGGCTGTG 
CAGGGGGGC3G GOGOGGAOOC CXX3ROCm3CT CGQRGrGGCC C3CTTQGAOGC 
GAGAAOGCAC COBCaOAACC GAOGGCTCAG GTGCX3GGAGQ CTGGCGGATG 
GAGAATOGGG TACTAAGAGA AAAQCAQGAA GCTGTGGATC ATAGTTCCCA 
AATGAM3AAA GOGTGTCAISC CCAGAAOGAG AACTC3U3^ AGCAGAATGA 
AACAAAATAG CAGAGAAACC TGACrGGGAS GCAGAAAAGA CCACTQAATC 
AaACRTCTOA AT6GGACAGA TACTTCXTTC TCTCTGGfiaS ACTTATTCCA 
TCACAGGCTG AAAATTCACT GCUVSGGCATC TCATTOGGAG ATATTCCTCT 
ATCAGTOATG GCATGAATTC TTCAGCACAT mTCATGTAA ACTTCAGCCA 
CAGGATGTGA ATCTTCATQA GOCCRTCTTG CTTTGTCCCA ACAATACATT 
CCAACAGOVA GGACTTCACA GTCACAAGAA CCATTTCTGC AGTTAAATTC 
AATCCTGAGC AAACCCTTCC TOOAACTAAT TTGACAGGAT TTCTTTCRCC 
CATATQAGOA ATCTAACAAG CCAAGACCTA CTGTATGACC TTGACATAAA 
GAGATAAftCI TAATQTCATT aGOCAC!2USA\ GACftACTTTG ATCCAATGGA 

ctttttgato aaccagattc tgattctgqc ctttctttag attcaagtca 
tctgtcatca agtctaattc ctctcactct 6tgtgtgatg aagotgctrt 

ACTOACCATG AATCTAGTTC OCATCATGAC TTAaRAOGTG CTGTAGGTGG 
GAACCCAGTA AGCTTTCTTCA CTXGGATCAA AGTGATTCTG ATTTCCATGG 
TTTCftACACG TATTTCATAA CCACACTTAC CACTTACAQC CAACIGCACC 
TCTGAAGCTT TTCGGTGGCC TOaQMbGlCA CftGAAGI^TAA GOAGTAGATA 
ACAGATAOAA ACITGAGCOG TGATGAACAG OGTGCTAAAG CTTTGCATAT 
GTAGAIGAAA TTGIOGGCAT GCCTSTTGAT TCTTTCAATA GGATCTTTAAG 
CXGAOVGACC TACAAOTCTC AC^TXATCCXST GACATCRGAC GAAOnGGOAA 
GCTGOSCRGA ACTGTCGTAA AOGCAAATTO GACATAATTT TGAATTTAGA 
TGfTAACTTGC AAQCAAAGAA GGAAACXCIT AAGAGAGA0C AAGCACAATG 
ATTAAC&TAA. TGAAACAGRA ACTOCATOAC CTTOlftTCATG ATATTTTTAG 
QATOACCAAa GTAQGCCStfSr CaATCCCAAC CACrTATGCTC TCCAGTGTAC 
AGTATCTTGA TAGTACCCAA AGAACTQGTG GCCnCCftGGCC ACAAAAAGGA 
GGAAAGAOAA AGTGAGAAGA AACTGAAGAT GGACTCTATT ATGTGAAGTA 
GAAACTGATT ATTTGQATCA GAAA(3CATVG AAACTGCTTC AAiGAATTOTA 
CTGC23LCrrT6 AATAACTCAfi TTAAOSCTOT TTTGAAGCTT ACATGGACAA 
CTTCAAGATC ACACrTGTGG GCAATCrrGQG OGAGGCACAA CTTTTCATGA 
ATACAAftATT CATAGTTATG TGCAAAOAAT AOOTTAACAT GAAAACGCAG 
CKSQKTQXKA. GCCZCTOCTTT TTAAGAGXAA GTTGGTTACT TCAAAARGAS 
GBKICKMOT A9TT3AAGAG GTATTTCAQT TTTAAATQCA AAAaCAGGCTI 
AGTTTCTTMl CACISkTACTO AQCTTTTCAA ACACTATTTT AATCTTTATA 
AAATTTTGCr TTGT 

Seq ID NO: C245 Protein Sequence 
Protein Accession #: NP_004433 



TGCGAGOGA9 
QCATQAGGAA 
TGATGATGAA 
TAGAAATGAG 
GTTQCTTTCA 
TCCAGGCAGT 
GGCTATAAlSr 
TAGAAGAGAT 
TCATACCSiCC 
GGTTGACAAT 
TATAITTGAT 
TGTITCTCAG 
CAATAATACC 
AGGXTATTGC 
CTACTACCCA 
AGATCTTACA 
AGAATCTACT 
CCrrTGAAGAG 
CCCTTTTTCT 
TAGAXATTAT 
AAATAAAGTT 
AGATOATGTA 
TAACAAAGCT 
TAGATTAAQA 
CCATGATGGA 
AACCCIAAAAG 
GTAATGTTCA 
TCTTTAAGOJA 
ATGTTTAGGA 
AfilGCATTGT 
TAAGACTTTC 
CAAACACTGG 
ATTTTCATTT 
TTTAACTTAT 



MALRSLGAAJb 
VCNVPES8QN 
DSDVrjETFESm 
YGGCMSLIAV 
CNGDGEHIiVP 
THCVCRHGXy 
ICKSCGSGRO 
3PQFASVNXT 
YNATAIKSPT 
LPliXiGSSAA 
FTYEDPNEAV 
TEKQREDPLS 
VIQ^VGHIiRG 
YTSALGGKIP 
EQDYRLPPPM 
IHLPLLDRTI 
lAGaQKKXUff 



11 
I 

IjLLPLLAAVB 
NWUKTKFXRR 
MEKPWVKVDT 
RVFYKKCPRI 
IGRCMCKAlGF 
RADLDPU^MP 

TNQAAPSAV8 
KTVTVQGIiKA 
GIATFLIAWV 
RSPAKBIDXS 
EA&IM3QFDH 
lAAGMKYIiAD 
IRHTAPBAIQ 
DCPSAUIQLH 
PDYTSFMTVD 
SIOVHRAOMN 



21 
1 

ETIiHDSTTAT 
RGAHRXHVEM 
lAADBSFSOV 
IQEIGAIFQET 
EflVlSffGTVCB. 
CTTXP5APQA 
YAPRQXiGLTB 
IMHQVSRTVD 
OAIYVFQVHA 
lATVCHRRRG 
CVKISQVIGA 
PKVIHLEOVV 

KimnmDikAA 

YRKFTSASDV 
LDCWQKDRHH 
SnitBAXXMOQ 
QIQSVBV 



31 
1 

ABLGHHVHPP 
KF8VSDCSSI 
DXlG^RV^QCIH 
L&GABSTBLV 
GCPSGTFKAH 
VZfiSVDETSL 
PRXYISDLIiA 
SlTljfiWSQPD 

PERADSE5fTD 
GBPGBVCSGH 
TKSTFVMIIT 
RinLWSlII.V 
VfSYGrVMHSV 
RPKFGQIVHT 
YKESFANAGF 



41 
i 

SGHEEVSGYD 
PSVPGaCKET 
TEVRSFGPVS 
AARGSCXANA 
QGDSACTHCP 
MLBHTPPRDS 
ITTQICTPEIQA 
QPtSGVIIiDYB 
GKKYFQfFMTE 
KLQHYTSGHM 
UCI.PGKSE1F 
EFMENG3LDS 
C3CVSDFGLSR 
MSYGERPYHD 
IiDKMlRHPEIS 
TSFP W SqMM 



51 
I 

ENMNTIRTSQ 
F«n*YYYEADF 
RSGFyiiAFCID 
EEVDVPIKLY 
IHSRTT8BGA 
GGSEDIaVYHl 
VHGVTDQSPF 
liQYYBKEIiSfi 
ABYQTSIQBK 
TFGMKIYIDP 
VAIKTTjKSGY 
FIiRQSnXSOFT 
FLEDDTSDPT 
KTHQDVINAl 
JjSnMAPItSSG 
MBDXLRVGLT 



60 

120 

IBO 

240 

30D 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

loao 

1140 
1300 
X260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 

laeo 

1920 
19B0 
2040 
2100 
2160 
2174 



60 

120 

IBO 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

90D 

960 

9B7 



Seq ID NO: C246 Protein Sequence 






Protein Accession #: KF_114l4a.l 

1 11 21 31 41 51 

| | | | | | 

KDASSVFQKD lAVKKKIiXKF RYVKLISMBT SG6SDDSCD8 FASI3KFANTR LQSVEEGCRT 60 

SGQCSHeGPL KVAMKFPARS TSjSATKKXAB SRQPSENSVT DSHSDSEDSS (3HNFU3KaAIj 120 

NIIOQHKAMIA KUfSELESFF GSFRGRHPLP GSDGQSRSPll RRTFPtJKftfiR SHPERRARFL 180 

TR£KSRlIiQ3 1J>7VZ>PHBEBE EEDK^MLVRK RKTVOGYMHE DOr<PRSRR6R eSVTLPRIIR 240 

PVEEITBG6V GERLQQPSICR RYITVHWAI*!. VINAVRRIiIiI PKQTABTQTA OAPEASSVAF 300 

AFETVHVSK5 GHIiCHIRTGX ARLVEBSATA VSACSSEHDGV RLG^LCI 347 

Seq ID NO: C247 Protein Sequence 
Protein Accession #: NP_036577.1 

1 11 21 31 41 51 

| | | | | | 

MBNPfiPAAAL 6KAZ.Cftt<ItI<A TI-GAAfiOPLS 6BS1CSASAP AKTSXTPX6K H8QTAFPRQY 60 

PIiFRPPAjOHS GliLjG£AAH&&D YSMWRKNQYV SNGLRDFASR GBAKAIiMKBl BAAGBAXjOSV 120 

HKVPSAPAVP SGTGQTSAEI* BVQRRHSI»VS FWRIVPSPD WFVOVDSLDL CDGDRWREQA 130 

AL^DLYPYDAO TDSGFTFfiSP KFATIPQDIV TEXTSSSPSH PA^SFYYFBL KAIiPPlARVT 240 

ZiVlUillQSFRA FIPPAFVLPS SDBEBIVDSAS VPBTSUXSV SIi»3Sm3LCG GHGGRLSTKS 300 

RTRYVBVQPA HHQSPCPBItB BEKBCmSNC V 331 

Seq ID NO: C248 Protein Sequence 
Protein Accession #: NP_063947.1 

1 11 21 31 41 51 

| | | | | | 

MLQDPDSDQP IiNSLDVKPlA KPRIPMETFR KtfOIPIIIAli LSLASIIIW VI»rKVIliDKY 60 

YFIjiOGQPUIF IPRKQLCD6B IdX;PIiaED£& HCVKSFPEGP AVAVRIiSKEKR SmiQfVZiDSAT 120 

GSmFSACEDN FTEftlABXAC XQjnSYSSKPT FRAVEieSDQ OtJ>VVEXTEK SQEUOQIHSS IBO 

OPCLSGSLVS LHCXAOGKSIt KTPEWGGEE ABVDSWFViaV- SIQfyDKQHVC GGSILDFHRV 240 

LTAAHCFRKH TDVFNHKVRA GSDKLGSFPS TAVAKIIIIB EBrPMYPKDWD lAIMKLQFPD 300 

TFSQTVHPIC IiPFFDBBLTP ATPXiKIlOWG FTJKQNGGKMS DILLQASVOV IDSTRCNADD 360 

AYQ6EVTERM MCZAQIPBOtaV DTCQGD8GGP liMyQSDQHHV VGIVSHGYGC GGPSTPOVYT 420 

KVSnyiiNHXY NVnXABEi 437 

Seq ID NO: C249 Protein Sequence 
Protein Accession #: NP_003036.1 

1 11 21 31 41 51 

| | | | | | 

MGCKVLLNXO QQMIiRSKyVD C!GREBTRI>SR CLNTFDIiVAZi OVGStCIiGAGV YV1A5AVARB 60 

IZAGFAIVI&F LIAALASVLA GIXYGEFQAR VFKTG&AXIiY SYVTVQELMA FITGHNLILS 120 

YIIGTSSVAR AMSATFDEIiI QRPrGEF&a*r HMTLIilAPGVI* AEEIFDZFAVX IILIIiTGLLT IBO 

lOVKBSAKVH KXFTCXKVLV LGFIMVSGFV 1QQSViaZH<^T EBDFGBIT9GR liCUINDTiCEG 240 

KPGVGSFMTF GFSGVLSGML TCFYAFVGPD CIATIGEEVK 1VPQKAIPVGI VASLLXCFIA 300 

YFGVaAAIiTL MMSYSCUSSttf SPCPDAFKEV GHBOAXYAVA VG5LCALSAS UASHFFMPR 3€0 

VIYAMAEDGZj 3biFKFI»ANVllD RTRTPllATL ASGAV3WVMA FIiFDliKDliVD IiMSIGTIjLAY 420 

SlAnUkCVIA/Ii SY^EQPHLV YQHAST5DEL DPADQfelBIA8 IHDSQXiGFLP EAEKFSLREI 4B0 

IiSPKIMSPSK ZSGILIVZnST SLIAVLIITF CIVTVIiGREA XiTKBAUIKVF IdMBKAiC^ 540 

WTGVinSQP £8KIKLBFKV PFIiPVIiPIZiS TFVSVyLHHQ liDQGTHVKFA WiMLIGPIIY €00 

PGYGLMHSEE ASLDADQART PDGKLDQCK ^29 

Seq ID NO: C250 Protein Sequence 
Protein Accession #: NP_002767.1 

1 11 21 31 41 51 

| | | | | | 

KRAFHUHiSA ASGARAIAKIi laPZiUmQLIVA ASAAUrPCB9D TRU>PBAXGA PCASGSQPNQ 60 

VSUlOGL&Sa CAGVLVDQSn VLTAKROQHK PIMHRViaiDE LIiItLQGBQIiR RTTRSWHFK 12D 

YBOGSGPH^ RSTDEBDIML IiKIARPVTPG PKVRAXiQLFY RCA0PGDQC3Q VAGMGTTAAR ISO 

RVKnnOSLTC &SITZL8FKB CEVFIEPGVVT IDDOICAGUSR GQDPGQSDSG GPXiVC!D&TLQ 240 

GILSHGTYPC GSAOHPAVYT QXCSCXKBWIBr KVXSSH 27S 

Seq ID NO: C251 Protein Sequence 
Protein Accession #: xP_D95<iea.3 

1 11 21 31 41 51 

| | | | | | 

MlRAATSkBFG KV5PASPAK6 TASLPRAFIiO SLRTLIAXXiD DHCStGCVRIA BIQSLWVBAH £0 
BLP8GVLEGL 8QRRGPQPQA AVRSRRGGAV PRGARAVPER CRGTBXRRG& BCSGLQRLGG 120 
GFRGCPAOPC ARGEHRRBrX T869D0GIiIiK QMKELEQEKE VLIiQGUEWU. QQRDHYQQQtj 180 
QCfVQERQCRI. GQSItASADPQ AVOSPRPIiGR LLPKVOEVAR HLQEUAEAC AGRAZ.FTSSS 240 
GPPCSAIiTST S5PGWQQQIX UfLKEQHRUi TQEVTBKSER ITQLGQJCSAIi XIDQIiFBARAli 300 
SQQDGGLSPA GPEIKPIiT»7 RWrbXttAGA IJ>8PHSPQI.X> UPIaSnOSGGP XJOBLEDTHFP 36D 
AVI.I.KVPSEa XRTSIHHRliHP HQ^PABOnNQ liGCGAEAAPE TGGTLSHFES HKTTCTSDSL 420 
GGPCPQBGDR SWSHI/JftAFD VAPAVA2CVTP HRSDAAGSRH CT3ICELCEKG LLTFRDIAIB 480 
FSIAEHQCliD HAQQ^YRDV KLEETYRNIiFS LGMTVSKPDL lACLEQKKEP QNIKRllBKAA 540 
XHPVTCSa^ QDIdQPEQSIK BSliQKVIPRT YBIQOaHSZLQ UOCOCKKVDE CEVBKGGSHD £00 
Xia<2CL6in?QN KZSrQXHKCVK VFSKFGIISNR HEORYTGEEKH LKOEKYOKSF CHFSHLKQHQ 660 
XIBTKEKSYK CBBOGBSRaH SSfiGT TOKR I lOGEKpYRCB BCGKAFRWPS finUTSHKRIHT 720 
GEKPYACBBC GQAERRSSTX^ TXUHKRIHTGE RFTKCSEC9GK AFSVSSAIiZY HKRXHTGEKP 780 
YTCBBOBKAF NCSSTLKTHK IIKTOBRPYT CBBCGRTFNC S5TVKABKRZ mWKSYSCS AW 

BCDKAPKHHS 
GWIiVRHKHBH 
QRNTGVGGLl. 
CLiEQKKEPWN 
EVQKGGCNEV 
MVSQUIQHQZ 

KRIHTGEKP7 



eLAKHKIlHT 

TFRDWIBFS 
ZKRNEHVTKH 
NQCLSTTQaK 
THTSENSYQC 
EKPYTCEECQ 
TCEEC3GKAFN 
CDKAFKNB8S 



GBKEYKCSDS 
VRFNRI>?SH6 

PDI.PPBI1GIK 
IPQTHKCVKV 



KAIiAKSSEVO 

QQKLYBDVML 
DSLQKVIFRR 



KVYSGDOEHG 
QRKEEPDLQN 
EmfRNliVSIiG 
YGKSGHDNLQ 



QAPSRSSTIA 
CSSTLKKHKI 
liANHKBMHTG 



aiLSK HKRIH 
HBKRZHTGEK 
IHTGBKPyKC 
EKFYXCE 



TGEKFYRCEE 
PTfTCESOGKA 



IRVHKKKETQ 
HYDHQNALED 
lAVSKPDilT 
VKTCKSMGBC 
KCKKYOKSFC 
CX3KAFTWSeT 
FStfSSSIiTYH 
STLNTHKRIH 



Seq ID NO: C252 Protein Sequence 
Protein Accession #: NP_114433.1 

1 11 21 31 41 51 

| | | | | | 

MAGftSMItLUf XjIjSCUVKTGV LGDIIMRVSC APGflPyHKGH CYGYTOTOtrBN HSDASLSGQ8 
YGNGAHIiAST I.SIJCEABTIA BTISOYORSQ PXWIGCHDFQ KSiQQinQHXDG AMYLTRSHSa 
KSMGGNKHCA EM98NNNFI>T HS&NBOTKRQ HFIiCmTRP 



Seq ID NO: C253 Protein Sequence 
Protein Accession #: XP_0518G0.2 



1 

I 

MDGVEIKiSTEV 

DHRAEVQLlaS 
HM6QQLV0QY 
■raaiiGHCFFT 

OCHAVSTFWH 



AYKEiaDBGAH 
GTQOOINRZK 
liAFRU^WQ 
SRYPOSYIiTK 
HPLYI^BGALT 

NBREKPAPC3 
TKDHFLHVKM 
NSILQ6IFWQ 
QFRSSERPIN 



11 
I 

VYKKOQDYRP 
GDTIiVZASTD 
RHIIVMGBKES 

SDGPEERNTP 
ABPUHHIjIHC 
GMIIDKGVKT 
LRGQDVWLDS 
GPGGLDE9GK 
SCPHNEnrTGI 
UDNWLVRHPD 
RSTHYQQYQP 
VHBRLIiKQTS 
MKGCBSIKIK 



IiTNYVAXlFD 
VTLDTEDHKA 



21 
I 

AOfDSGRACR 
y&HKQAEBFQ 
DKOfPYSNHI 
DERGGYDPPT 
DECLGLtiVKS 
AAM3SBETBP 
TBASAKDKRP 
CRFADHBlGli 
TIiPIGQKfPI 
AFEDVPITSR 
CINVPDHRGA 
WTLQKQYTI 
KTQVFVftTIiQ 
ALIPKKBGVfl 
TOSnOFAYlEVD 
SI8IVLKASKG 
KIPCJWPIPV 



31 
I 

SYRVRFLCGK 
VLPOtSCAPN 
GHFFDFDTFG 
YZSDLSIHHT 
GTIiLPSDRDS 
HFZFHHVPTG 
FXiSIIflARYS 
TLA50GTPFY 
RGIQIiYDGPI 
VFPG£PGPW? 
ICSGCYAQHY 
HWDQTAPAEb 
MDKVEQSYFS 
DCTATAYPIO^ 
GKKYPSSEDG 
RYVSSGPHTR 



41 

I 

PVRPKl/rVTI 
QVKVAOKPHY 
GHTKFALGFK 
FSRCVTVHGG 
KMCIWITQDa 
PSVGHY&SGY 
BHQDADFUKP 
DD05KQE1KN 
NIOHCTPR-KP 
NQIiDHEXJDKr 
IQAYKTSNLR 
AIWZirNSHXG 
RBHYTWIEDS 

teeawdvpm 
iQvwnx^ 

VLCKIjGADRG 



51 

I 

DTNVNSTirja 
L&I6EBIDGV 



NGLLITOWG 
YPGYIPKPRQ 
8EHIPLGK7Y 
RESAXIEHFX 
SIsFVOBSGMV 
VALBORHTfiA 
SVEHDVDGSV 
MKZIKNDFPS 
DHIimLCYF 
QLLPIiKIiKAQ 
PKKLF6SQI1K 
GRW£HT8PR 
LKLKEQMAFV 



Seq ID NO: C254 Protein Sequence 
Protein Accession #: NP_05518e.1 

1 11 21 31 41 51 

| | | | | | 

MTALSSENC8 FQYQIiRQTUQ PLDVHYLLFL IILGKIIiLHI LTLGKRRKHT 0QNFZ4EYFCI 
SLAFVDLIil* WISIILYFR DFVLItSIRFT KYHICLFTQZ ISFTYGFLHY PVPLTACXDY 
CLMFSOTKIji SFKCQKliFYF FXVII>IWXSV EAYVLGDPAX YOSLTCAQHAY 8KRCPFYV8I 
OSYHZiSFFHV HIIiFVAFITC HEEVTTXaVQA lEXTSYHHBT ZttVFPFSSHS STTVRSKKZF 
LSKLIVCFLS THZ.PFVLIiQV ZIVLLKVQZP AYIEMHIFKL YFVSSSFIiIAT VYNENCHXLH 
ZilCDZOIrPLDP PVNHKCCFZP l^ZPHItEQIE KPZ8XHXC 

Seq ID NO: C255 Protein Sequence 
Protein Accession #: Eos sequence 



900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1357 



60 

120 

15B 



SO 

120 

IBO 

240 

30Q 

3fiO 

420 

480 

540 

600 

660 

720 

780 

B40 

90O 

960 

996 



60 

12D 

IBQ 

240 

300 

338 



vaimJoaisLL 

IFLYWQPKD 
FOfCRCCKKC 

srkladshfr: 

IPVLDBIKSM 
BCPSSETOISI 
SVOSOTTTW 
YDSYWHLOSli 
GFLF<:QnLMX 
SKHKUrFBQV 
LGAAQaKHIQ 
lOCRDftQTZKT 



11 
I 

UsLGGNSFSG 



21 
1 



RVVSZSHFVM 



DI/HTLUKETP 
ATAIEETKEA 
RIiSLSQl^SH 
AGIXRVZiNSX 
VICSliTIiXV 
IWLTFVFOA 
Y8DCKXSIRGT 
DFAAOGIDRH 
IHQQRVIiFXB 
KKYGRirZIGY 
XATVFL1<PAX> 
TSPSOB 



QKAYBSKXDY 
NQPFIAKCFA 
BQIKYIIiAQrY 
liEDJHHfiTljKS 
PELRQIjPFVD 
CSfiDXDHVTQR 
IFYYI^GIiLCG 
HVEKLXCEPY 
YGTLHLQNSF 
2IYP8Y1A0T6 
QSLSTLYQSir 
FEHYXtQHlBF 
ZFAVKZAKYy 



31 
I 

AWHYELFATir 
DKZVYYBAGI 
ISLL-VICJXX 
HTTKDKAFTD 

AELDHVNNVIt 
LPIQDZLSAF 
VC6YDRHATP 

TSKE)UPRVU7 
NJ8EHLN1NE 
KSPAGVKIiXiS 



41 
I 



SBHDSEE)VrD 



ILCCTOJI.I.P 
SXGXFY6FVA 
htSSTSlSVUGG 
SIiTSVKTSLR 
RTDXiDOLVQO 
SVYVEUTESY 
TXRBC7GNT6 
TPYLLNKDWK 
HTGSXSSBLB 
FAYDliEftKAH 
ZtBRVTRXIAS 
FVATAliDTAV 
DVBTIFKKHH 



51 
I 

PIGHiFELVH 
XXLMPI.VGYF 
gHQVRTSXECR 
GILIXSXrRFNX 
33U3DFLCLV 
GYQSLNPXFD 



GVFLMTOVt3Li 
TYIiSGKLFBJK 
SI>KVm»NXFIi 



XiDFliQNFITH 
DVFI.CSY1ID 
BHGnSfGYHKD 



Seq ID NO: C256 Protein Sequence 
Protein Accession #: NP_149038.1 

1 11 21 31 41 51 

| | | | | | 

KKAIIHI/XUlt AUtSVHTAXBT QGN9ABAVTT TETATfiGPTV AAADTTETHP PBTA8TTANT 
PSFSTAISPA PPIX8XaSS9 TXPTPAPFIX SIH88STIP1 PTAADSESTT MVNSZiAT^Z 
XTASSEMDGEr XTKVPSETQ8 NHEMSPTTED NQSSGPPTGX tanJBT&OJlSS TGPSNPCQDD 
PCADH8LCVK ISHTCSFCUZU BGYYYDSSTC taOSKVFPOKI SVTVSSTFDP BBKHSMAYQD 



60 

12D 

180 

240 

300 

360 

420 

4B0 

540 

600 

660 

720 

780 

840 

856 



£0 
120 
180 
240 



IjHSEITSLFK DVPGTSVYGQ TVTIjTVSTSL SPRSEMRAOD KFVMVTIVTI liAETTSDKTEK 300 

TVTEKINKAI RSSSSNPLNY DLTLRCHYyQ CNQTADDCliN GIACDCKSDL QRPNPQSPFC 36 O 

VASSLKCPDA CNAQHKQCLl KKSGGAPECA CVPGyQEDAW GNCQKCRFOT SGLDCKDKPQ 420 

l.XI.TIV<3TXA. OIVZLSMIIA UVTAaSHNK TXBIEEEE4I.I DBDFQHLKLR STGFlHLCttUB 480 

GSVFPKWtlT AS&DSQMgNP YSRHSSMPRe DY S12 

Seq ID NO: C257 Protein Sequence 
Protein Accession #: NP_001423.1 

1 11 21 31 41 51 

| | | | | | 

MTAGRRMEML CAGRVPAUjL CLGFHI^IjQAV I*STTVIP3CI PQESSDNCTA L.VQTEDNPRV 6D 
AOVSITKCSS DMNGYCliHGQ CIYDVIIMgQK YCRCEVGYTG VRCKHPFLTV HQPI^KEYVA 120 
LTVILlILtX* ITVVt?STinfF CRUYllNRKSK SPKKEYERVT 9QDPBLPQV 169 

Seq ID NO: C258 Protein Sequence 
Protein Accession #: AAC63902.1 

1 11 21 31 41 51 

| | | | | | 

MDRSKEHCIS GPVKAaTAPVQ QPKRVLVXQQ IPCQMPUVN BGQJlQlEVI,CP SNBSORVPLQ 60 

AQKLVSSHKP VQHQKQKQIiQ ATSVEHPVSR PUJNTQKSKQ P1.PSAPBNNP BEEIiASKQKN 120 

EBSldaiOHAX» £DFBIGRPLG KDKFGHVYIiA RZKQSKFILA LJCVLFKIUQI£ ICAgVEHQLSK 160 

^ - EVEIQSHUIB KfflZiRIiYQYF EDA7KVYI.IIj BYAPJUSTVYA ELQKEiSKFDG QRT&TYXTBL 240 

ZD AnaUiSYCaSK BVIHRDIKPB mJUUSSKOfEIi KIADPGnSVII AP&SRBTVXiC dTLDYLFPEH 300 

lEGSMHDBKV DLWSLSVLCT BFLVGKPPFE ANTYCSBTYKR laRVEFTFFD FVTBGARCXiI 360 

fiSZiUKHNPSO KFHLREVIfES FWIVAS&SKP SHCONKESAS KQS 403 

Seq ID NO: C259 Protein Sequence 
Protein Accession #: NP_037504.1 

1 11 21 31 41 51 

| | | | | | 

113RTAyTViG& LLLL LSTUiP AABC^KKKOSQ GAIPPPBKAQ BHDSECmSP QQPGSRNBGR 60 

^D GQGSGTJkMPQ EBVIfESSQEA LaVTSSSKYLK SBHCaCTDPIK QTIHBBGCN8 RTIZSiRFCYG 120 

QOaSFYIFRH IRKBEGSFQS CSFCKPfCKFT rmMVTLMCPB LOPPTKKKRV TRVSQCEbCIC ISO 

IDU> 3L84 



Seq ID NO: C260 Protein Sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MKV GVLWLIfi FFTFXaOGHQQ FLGKNDGIXT KKBLIVHKSK HU3FVBB3E^ LLQfVTYBDSK 60 

43 BKRSljltErFLK LUCPPI.I.HSH OLIRirRAKA TTDCNSLHGV liQCTCEDSYT HFPFSCLDFQ 120 

KCYI^HXAGAI. PfiCSSCHIiNNIt 8QSVNFCBH,T ICIHQTPKINE XFTNDLLNS8 SAXYSKYANQ ISO 

IBIQUCKAYE RIQGPESVCfV TQPHNOSIVA GYBWGSSSA SELLfiRlEHV ABBAKTALHK 240 

T«FFIjEDG6FR VFOKAQamX VF6FGSKDDH YTLPC6SSXR i^ITAKCESS GHQVIBBXCV 300 

L SLLBE MIKW FSKIVGHAIB AAVSSFVQin. SVXXRONPST TVGHLASWS ZltSHXSSEiSL 360 

3U ASHFRVSHST HEDVTSZAI»I ZIir8ASVT»ff TVLLHSBKXA SSRLI^ETLSIS' ISTLVfiPTAXi 420 

PMIFSRKPID WKGIPV2IKSQ UtHGYSYQIK MCPQNT8XPI HGRVLIGSDQ TOtSMETII 480 

SMASLTLGKI LPVSKNQaWQ VHOPVISTVI QNYSINEVPL FFflKIBSNLS QPHCVFWDFfi 540 

HLQlaMDASCH IiVISETQDIVT GQCTHLTSFfi IIMSPFVPST IFFWKniTY VaiiOlSXQSIi 600 

XZiCDTZBALF 1«KQXKK&QTS HTICRXCHVHX ALSLLXADVH P2VQATVDTT VNPSGVCIAA 660 

33 ygg THPFttiS XtFFNMUtLGI LIAXIlXIZiVF HaMAOHUOCA 'VGFCLGVGGF LXXSVITIAV 720 

TUFSNTYICRK DVCnUOKSNO SKPZ»IAFVVP AIiAXVAVNFV WIJCiVLTKLPf RPTVGERLSR 780 

DDKRTIIRVQ KSIiliIIjTPMi GLTMGFGIGT rVDSQHrAMK VIFALLNAfQ OPFILCPGII* 840 

UDSKIiRQLIiP NKLSALS&HK QTEBQHSSDL BAKPKFSKPP HFIdCJNnOHYA PSBTGD66ia!l dOO 

2MUC0PV8HB 9X0 



Seq ID NO: C261 Protein Sequence 
Protein Accession #: NP_000575.1 



KTSKIiAVZ^Ii AAFiaSAALC aGAVXfSt&AK ElfiCQCIICTY SKPFRPKFUC EUtyXBSOPH 60 

CKHTEIXVXL SDGRSLCUIP KEHHVQRWB KFLKIIABHS 99 

Seq ID NO: C262 Protein Sequence 
Protein Accession #: NP_005594.1 

1 11 21 31 41 51 

| | | | | | 

M8TSRD9BTT FDBD5QSE9DB WFYSDDETB DSLDDQ6SAV BPEQNRU!NRE ABENREPFItK 60 

ECrWQVKfillD BKyHBQPHFH HTKFI>CXKBg KXAXntAXXTY KXHAFTFIEM NLFBQFKBAA 120 

MCYFIAliLIL QAVPQISTLA WTTTWPJJJV VLGVTAIKDL VDDVASHJMD KEINNRTCEV IBO 

XKDffilFKVAK WJCBIQVGEfVI RIiKKNDFVPA DZI/LLSSSEP NSLCYVETAE LDGETNUCFK 340 

MSLBXTDQYL QREDTLATPD GFI ECBB PliN RLDKFTGTLF WHHTflFPUDA DKILIiRGCVI 300 

R^rxUFCHOLV IFAGaDTKZH KHSGKEEtFKR TKmYZJORH VYTXPWItXZs L&AGZihlGHA 360 

OV YWHftQVGHSS WYLYDGEDDT PSYRSFLXFIT G7IIVLNTMV PZ6I.YVSVSV XRX.GQSHFIBT 420 

flDI^jMYYAEK DTPAKARTTT IiEIBQIjGQIHY iFSDKTGTIiT QiriMTFKXCC XNGQXYGDBR 4 BO 

DASOHUHUKI EQVDFSKNTY ADGKLAFXDH YLIEOIQSGK EPBVHQFPFli IiAVCHTVMVD S40 

RTDGOliHYQA ASPDBGALVN AARMFGFAFL ARTQ^ITXS £U?TERTYNV UVILDFH&DR 600 

KRHaXI-VRTP BGKXKLYC80 ADTVIYERZ^S RHHFTKQBTQ DAIJ3IFKNBT UlTLCIiCTEB 660 
lEBKEFTEMM KKPMAABVAS TSmDBALDKV YEBIEKDIill. LGATAIEDKL QDOVPETISK 720

LAKftDIKIWV LTGDKKETAE NIGFACEI*LrT EDTTICYGED IMSLIiHRRHE WQRNRGGVYA 7 BO 

KFAPPVQESF FPPGGMSAI.1 ITGSHIjMBXI. IiEKKTKRNKX LKLKFPRTBE ERRMRTQSKR S40 

RIiEAXKEf^ KNFVDLACEC &AVICCRVTP XQjBCAHWXiLV KRYKKZULTLA IGDGANDVSIM 900 

IKTAHIGVGI SOQEGMQAyH SSDYSFAQFR YLQRI.L1.VHG RHSYIKMCKF tAYTFyiCNFA 96D 

FTLVKPSTirSF FNGYSAQTAY EDWFITL*niV IiYTSLPVLLM GLLDQDVSDK L8£iRFPGLYI 1020 

VGQRDLLEWY KRFFVSLLHG VDTSHII.PPI PLGAYLQTVG QDGEAPSnYQ SPAVTIASAL 1080 

VITVNFQI6L DTSYHTFVHA FSIFG8IALY F6IKFDFKSA 6IHVLFP5AF QPTOTASHTU^ 11 4 D 

RQFYIWLTII IiTVAVCLbFV VAIRPIiSMTI WPSSSDKIQK HRKRLXABEQ HQRKQQVFItR 1200 

GVSTRRSAYA FSHQROyADIi ISGGRSIRKK RSPLDAIVAD GTABYItRTGD 8 1251 

Seq ID NO: C263 Protein Sequence
Protein Accession #: XM_044593

1 11 21 31 41 51

I I I I I I 

MLRIT^LRS KLARPHGAI.P PRPPI»liJ»IfL IiIiIiIil.QPPPP TWALSPRISL PU35EBRPFIi €0 

RFEAEHISNY TAI.IJiSR£»6R IXYVGAKBAL FALSSNIiSFI. PGGEYQBLLH GMl^kEKKQQC 120 

SFKOEDPQItP CQNYIKILLP IiSQSBIiFTCa TAAFSFHCTY ZHKEaSETLAR DBSONVIiIiED 180 

GKGHCPFDPH FKSTADWCa ELYTQTVSSF QGNDPAISRS QSUZFTKrEfi filiMMLQDPAF 240 

VASAYIPESL GSLQGDDDKI YPFPSETGQB FEFFEaanVS RIARICKGDB GGERVIiQQRW 300 

TSFI/KRQIiLC eKPDDOFPFN VIjQDVFTLBP SPQDWRDTIiP YGVFT6QWHR GTTEGSAVCV 360 

PTMKDViQRVF SGI»YKEVNRE TQQWYTVTHP VPTPRR3ACI TNSARBRKIN SSI^QUDRVL 420 

MFZiKDBFXMD OQVItfiBMLl.It OPQRRVQKVA VaKVPGLHBT YDVLFLGTGD 6RLHKAVSV6 480 

PRVHXXBEIO ZFSSOQPVOH IJiUSTHRGLL TAASHSOWQ VSKWCSZiYR SCQDCUJUID 540 

PYCAH&G&SC KBV&LYQPQL ATRPWIODIE GASAKDLCSA 8BWSPSFVP TGSKPCSQVQ 600 

PQPNTVNTiA CPIiLSNIiATR LWI^RMOAPVN ASASCHVIjPT GDU>I«V<3rQQ XjGBFQCWSIjE 660 

EGFQQIfVASY CPEWEDGVA DQTDEGGSVP V1I3TSRVSA PAGGKASWGR ilRSYWKEFIiV 720 

MCTIaFVZiAVIi IiPVLFIaEiYIIH RBISHKinei*KQ G&CAGVHPKl' CPWLPPEIR 9LIR31CPPSX "700 

PU3IHR6YQ8L 8D8PPGSRVF TESEECRPXiSX 0DSFVEVSF7 CPRFRVRIfGS BXSD9W 637 

Seq ID NO: C264 Protein Sequence
Protein Accession #: NP_008950.1

1 11 21 31 41 51 

I 1 I I I I 

HASQNBDPAA TSVAAARKQA BPSOQAAROP V(3KRliQQHI/f TLMM&GDKGI fiAPPBSDHTLP 60 

KWVGTIHGAA 6TVYBDLRYK LSLEPPSGYP YKAPTVKFiyr PCYHPHVDTQ GMICW5II.KE 120 

KHSALYDVRT XUIiSZQSIiIiS EENIDSPUIT H&AELHKNPT AFXKYLQBTY SKQfVT&QB? 179 

Seq ID NO: C265 Protein Sequence
Protein Accession #: NP_055399.1

1 11 21 31 41 51

I I I I I I

MGRGHGFLFG l^USKVWULSB CaSGEEQPPBT AAQRCFCQVS GYIDDCIICDV ETIDRFHUYK €0 

LPPRLOKLLB SDYFRYYKVN LKRPCPFWMP XSQCCJRRDCA VKPOOSDEVP SQIKSASYKY 120 

SSEANMblEE CEC2AESIiQAV DE&LS&ETQK AVLOHTKEEED S8DNFCBADD IQ&PEAEYVD 180 

IiIiLNRERYTG YKGPJUWKIIff HVIYBEHCFK FQTXKRFiaP ZASQQGTSBB nrFYSHItBSI. 24D 

DU CVEKRAFYRL XfiGU£AS32iV HliSnSLYXtLQE THLEKHHGHH ITEFQQRFDG ILTBSB6PRR 300 

ZiKHLYFtiYItX EUQUUSKVXiP FFERPDFOLF TGKKZaDSEH KMLIiCiEirSE UCSFFZJXFDB 360 

HSFPAGDKKE AHKLKEDESI. HPRNISRIMD CVGCPKCRLK GKLQTQC3LGT ALKXLFSBKL 42 0 

lAHMFBSSPS YEEBIiTRiQEZ VSI^HHAFORZ STSVKELEHF SHLLQHIH 46 B 






Seq ID NO: C268 Protein Sequence
Protein Accession #: FGENESH predicted



Seq ID NO: C269 Protein Sequence
Protein Accession #: NP_002429.1



Seq ID NO: C266 Protein Sequence
Protein Accession #: NP_002879.1

1 11 21 31 41 51

I ] I 1 I I 

HQPRRQRLPA FWSGPRGPRP TAFZiirAIAlUL I>AFVAAPAOS GQPDDPGQPQ DAOVPHRLIiD 60 
QKARAALHFP HFASGePfiAL AVLAEVQaOL AWINPKBGCK VHWFSTERY HPESLLQBGE 120 
^UiQKCaARV FFKNQKPRFT JNVTCTHIiIE KKKHOQEDYl* liYKSQKKQIfKK FX^XVaiPDH ISO 
HGHIDPSLBL IHDLAFLGSS YVMHEMTTQV 8HYYI1AQLT8 VSQWVRST 22 B 

Seq ID NO: C267 Protein Sequence
Protein Accession #: NP_005400.1



1 11 21 31 41 51

' ( I I I 1 

/U MSVKGHAiaii AVILCATWQ GFPKFKRGRC I^OPGVKAV XVADXEKASX HYPSIWCDKI 60 
SVIXTLKEHK GQRCUflPK&K QARl.tXKKV£ RKNF 94 



1 11 21 31 41 51 

L ' ' ' • I 

MliRQIVIiRRGI* QSFCHRLGIfC VSRHFVFFLT VPAVIiTXTFG IiSAUIRFQFB GDI^ERLVAPS 60 

HSXAKZBRSI. ASSIf PibDQS KSQliYSDLBT l><SY6RVnJ[> SPTGDOfflUQ AEGZLQTHRA 120 

ox) VXJSMKVHHKG YXIYTFSSIiCir XiRNQDKKCVL DDllSVZiBDZa ROftAVailKTT AKVQVRYPNT 100 

KIiKVCSFCKL LPIKSAAI^ LP 2D2 
1 11 21 31 41 51

I I I I I f 

^ MRljPliLLVPA SVlPGAVmjL DTRQPLIYNE DHKRCVDAVfi PSAVQTAACH QDAESQKFRW GO 

J V3BSQXHSVA FKLCLGVPSK TDWVAITliYA CDfiKSEPQKM BCKNDTIAai KQEDLPIWYG 120 

NSQEKMlMliY KGSGLKSRHK IVGTTDNIiCS ItGYEJ^TLL GNANOATCAF PFKFENKWYA 160 

DCTSAGRSDG WLWCGTTTDY DTDKIiPGYCP UOFEXSSKSIW NKDPLTSVSY QIN&KSAIjTW 240 

HQARKSCQQQ HAEIfLSITEI HEQTYLTGIjT SStTfiGLWIG LNSI^FIXSGW QWSDR8PPRY 300 

LNWIjPGSPSA BEGKfiCVSLW PGKMAKWENIi ECVQKLGYIC KKCNTTLNSF VIP3ESDVPT 360 

10 HCPSQWWPYA GHCVKIHRDE KKIQRDALTT CBKEGGDIrTS IHTIEEItDPI ISQLGYEPND 420 

SLWIGLSDIK IQMYFBWSIX3 IWfPl ' K WIiR GBPSHEHHKQ BDCWHSOKD GYHADRGCEW 480 

FDOYICKKKS H8QGPEIVHV BKGCRBQHKK HHFYCyMXGH TI)8TFAB?INQ TCNHBHAYLT S40 

TIEDRTnEQAF LTSFVGLRPE KYFtTTGLSDI QTKGTPQHTI KEEVRFTHWN SDMPGRKP6C 600 

VAMRTGIAGG LWDVLKCDEK AKFVCKHMAB OVTHPPKPTT TPEPKCaPBGDW GASSRTSliCF €60 

15 KLYAKQKHEK KTWFSSSDFC KRUSGDImABI DHKEEQQTIH IUIjZTASGSYH KLFHX/QI/ryO 720 

SPSEaSFTHSD GSPVSyEUHA YGEPEOiyQIIV EXCBBLK6DP OIM&HHDINCE HLHHWIOQIQ 7B0 

IQSQTPKPEPT PAVQPNPFVT BDGHVIYKDY QYYFSKBKBT HDKriVRAFCKR NFGDIiVSXQS 840 

ESHKJLFbWKY VNRUOAQSAY FIGLLXSUnC JCFAHMDQSKV^ DYVSHATGEP NFAtnSDEHCV SOD 

TMYSNSGFWK DlNCGYPilAP ICQRHMBSIH ATTVMPTMPS VPSGCKEGWJ FYSHKCPKIP 960 

ZU GPMEBERKNH QEARKACI6F GGNIiVSIQHB KBQAFLTYBH KD5IFSA»Ta liXIDVNSBUTF 1020 

XAiTDGSGVHY TSVIGSSISPQO RRSSLSYSDA IXWIIGGAS NEAGKMHPDT O^SKItGYXOC! 1990 

VRSVPSVEHIP PATIQTDGEV 1CrGKS9YSZjM RQKFQHHEAE TYCaO^HBIdl*! ASILDFYSNA 1140 

FAm^QMETSN SKVWlALNSCT LTDEIQY1:HTD XHRVRYTKHA ADEPKUCSAC VYTiDLDaYHK 12 00 

TAHCNESFYP LCKRSDErPA TEPPQIiPORC PESDHTAWIP FfiGHCYYISS SYTHNWGQAS 1260 

ZD It&ClitlMGSSL VSIE6AA&&S PLBYRVSPLK SKTHPHIGLF RHVEGTHLWI UNSFVSFVHW 1320 

maSDPaSBSJSS UCVAZSASSG FWSKIHC&SY IOSYICKRPKX IEAKPIHELL TIKADTRKHD 1380 

PSKPSGHVA6 WIXVILLIL TGnSIAAYPF YKKKRVBliPQ BGAPEEITbYF NSQSSFGTSD 1440 

MKDIiVtBITSQ MSH5VI 145« 

Seq ID NO: C270 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I 1 1 I 1 I 

DO KVLLHWCLLH LLFSLSSRIQ KLPTKDBEItF QHQXItDKAFF HDSSVXPDQA SISSYLFBDT SO 

PKRYFFWEB DHTPLSVTVT PCDAPlifiWKL GLQSLPEDRe GEG8GDZ1BF& BQQKQQXINB 120 

BGTELFSYKG 2IDVEYFIB8S SPSGLYQH^ XtSTSKDTRFK VXATTTPBSD QFYPSLPYDP 180 

RVDVTSLORT TVTZJUNKPSP TASliLlCQPXQ YCWIHKSHH FK6LCAVEAEC I.6ADDAPMHA 240 

PKPGLDF&PF DiFAHFGFPSD lifSGKEItSFQA KPSFKI^GBHV YSRPKVDZQK XCXCSilKIItFT 300 

40 VSDXiKPDTQY YPDVFWSIXN SSHSTAYVGT FARTK&&ASCQ KIVHIiKDGKI TDVFVKRKGA 360 

KFI.RFAPV86 HQKVTFFIKS CLDAVQIQVR iUDGKLLLSQH VBQIQQPQLa 6KPKAKYLVR 420 

IiKCSnOCQASH ItKXIATTRPT KQSFPSLFBD TRXKAEDKEA TCBSATVMHi GTQERIIKFCI 480 

YKKEVDDNYH EDQ^EKRBQNQ CU3PDIR3CKS EKVLCKYPRS am.QKAVTTB TXKOLQFBKS 540 

TlilDVYVZOH GGBSVKXQSK WKTHXFC 568 


Seq ID NO: C271 Protein Sequence
Protein Accession #: AAH34a29.1

1 11 21 31 41 51 

I I I I I I

MEKVQLEFENT QEHEKRI^EF R8TRNKERED RBSSEYYKICB 6KVGKDVHQS YHHSaHBQNV «0 
VKFSAQIGrKIi KIiIilCEQXQHP VKFTVNYKMA NSSESCEKFKX BSSVOGQCEU ICAALLVCL&C 120 
GEDYCSGCFA NVHQRGAIiiCL 5KTTI»IiQARS QIZiFSTVLUVA SQFXKDVHPD EPKEEHETSTX IBO 
BTfiKXQHKPK SVUoQ&SSSE VEXTTHKE2AQ trrmOC&LLC EGSFDBEA&A QSFQEVtiSQH 240 
35 RTGNEDDNKK QNLHAAVKDS I£ECEVffnnj KXHRBFUTIB IiKEDlXiSYKB KIiHLSKERRT 300 
PQBQU'KCYQ IR^XHHKFL ^VHBGVLKMKr MKXVKVBRPK VMTQLFTOQ 349 

Seq ID NO: C272 Protein Sequence
Protein Accession #: NP_078S€3.1


1 11 21 31 41 51

1 I i I 1 I 

HEKLWUCKHR RTPQEQXiFKK I.9DTFPEPHE TTGDAQClSQEf EHDSDfiDGBE TKVQHTALIiL 60 
PVETLHIERP EPSIiKIVELD DTYEEBFEBA ENXVFYKVKL ADADSORSCA FHDCQKHBFP 120 

05 YBMGXHQHKV PDKOKKDFIJf T.rT.TOaSSTYY KDNSKBBTSN TDFDHIVDPO VY88DXEKIB 180 
BSTSFERNUC SSCNIGLESNQ KSDDSCVSLE SKOTIiUSRDIi BEAPXEBKLS QDXXBSIfLS 240 
SXiYKRPSFEE &KTa3C8SXiXJr QEXACRfiKPl TKQYQSLERP PXEDTHERLH LLPSHRIiBCai 300 
ETSSTRITLAK nREWIFDHSL 8BYACE!IArVI, 6VLQGAQSP8 SSRHQQraiQQ KSQRFSTAiaF 3tiQ 
PLSETSVKESS fiCIiSSSHPitS SSAAT^SSSR AAfiBI^XBY XDXTDQlilBLS IjDDTTDQHTL 420 

70 XaSflsEKELQVIj SSIJVDTGBKIj YSLTSSBFPD FSSQSLKIISQ ISXDRLKTBa VRGPGGVEEb 480 
&CSGRDTKXQ SI»I>9LSESST DEEBEDFZ4RC OBVITLPHSK ST 522 

Seq ID NO: C273 Protein Sequence
Protein Accession #: NP_005399.1

1 11 21 31 41 51 

I 1 I I 1 I 

KECVSAVXiIiCL IiIiMTAAEHPQ GLAQPBALHV PSTCCFTFSS KKXSLQRLK8 YVITT8RCFQ 60 
KAVlFItnCX^ KBXCADPKEK HVOHYHKHLQ RKABTLKT 98 

Seq ID NO: C274 Protein Sequence
Protein Accession #: BAC051S8.1

1 11 21 31 41 51



t I I I I i 

MFI»r.TGGVSL BCSAEKNPDPT WLQDKSWEEI CaAS£Fi>AFR GI.RQHFCEHI YEWREIYDSK SO 

EPHiStAKFPAP MDKNIVNELOK IIILRCLRPD KITPAITNYV TDKLGKKFVB PPPFDLTKSY 120 

LDSHCTIPLI FVLSPOADPM ABLLKFANDK SMSGNKFQAZ SI^GOOPIA AKMIKA3aSB 160 

6TWVCLQNCS IAVSVn4PHI.E KirPPFTSEI OIS8FRLWLT SYPS£KFFVT ILQN6VKMTN 240 

6PPT6Z>RLNL LQSYliTDPVS DPEFFKGCRG ZELIiPUJKYD TIPFEAISYX* TGSGMYGQHV 300 

IDDNDRRUiD TMLADPYNIjY XVSMPHYKFS PStaJYf At>PX GTY3SDYI£FI KKLPPTQHPE 360 

IFGLHENVDX SKDIiQ<2TiCrLi F££L£iLTQ6G SKQTGAGGST DQILLEITKD IIiNKLPSDFD 420 

IBMADRKYPV RYEESMWTVi:. VQEMERFKMI. IITIRNTtRD laEWaKGVW HDSALEALSS 4B0 

lU 6m>V6KVPEl WUCRSYPSIiK PLG&YITDFL ARliNFLQDHV H66KPCVFMI* SGFFPTQAFI> 540 

TCSAMQNYARK YTTFIDLLGY EFEVIPSDT8 ST8PBDGVYI HGLYLDGARH SRESG&IABQ 600 

YPKliLFDIiMP IIKIKFTQKS RZlKSDiAYVC PLYXTSBRKO TliSTTGHSTH FVIAKLLKID 6€D 

QPTRHWIKRG S79 

Seq ID NO: C275 Protein Sequence
Protein Accession #: AAA60212.1






1 11 21 31 41 51

I 1 I I I I 

HABSHZiDQnii UiU>PTLC0P QTAANTTSSL ACAQOFESPHC QSI^EQALQCS ALOHCLQEVH 60 

GEVGADDLCQ SCEDIVRILI7 KHAKEAIFQD TMRKFIiBQBC NVLPLKUW QCNQVLDDyP 120 

FI>V1DYFQNQ TDSHGICMHIa QLCKSHQPEP BQEFGKSDPXa PKPUtDPLPD PMJ3FX»VLPV 180 

LPGAMJARPG PHTQDLSBQQ FPIPLPYCWL CRALIKRIQA HIPKGALAVA VAQVCPWPL 240 

VAGGIOQCLA ERYSVIIiIiDT IiIiOHMIiPQljV Caa:.VIiHCSMD DSJViSPRfiPTG BKIiPRDSBCH 300 

LCHSVTTQAG N3SBQAXFQA MLOACVGSNI* DaEKCKQFVE QHTPQUiTZtV PKSilElAHTTC 360 

QKDGVGCTMB SFliQCZHSFD L 381 






Seq ID NO: C276 Protein Sequence
Protein Accession #: NP_631911.1

1 11 21 31 41 51 

I I I 1 I 1 

HLGOGXPAEiG LTiUJiOGSAD CBHGIQGTFYP WSCBGDIHUR BSCGGQAAID SFHIiCUStiRC 60 
CYRNGVCYRQ UPDENVRRKK MHALWTC£G LLLLSCSIdi FVIWAKfifiDVI* HMPGFIiAGPC 120 
DMSKSVgrjI>3 KHRGTKKTPS TOSVPVALSK BSRDVBQQTB QEQTEEGEET EGEEEED 177 

Seq ID NO: C277 Protein Sequence
Protein Accession #: NP_473364.1

1 11 21 31 41 51 

I I I I I I 

MKLVXIPLLV l-ISLCSYSAT AFLIIIKVPLP VDICLAPX^PLD HILPFMDPLK UAJCTLGISV 60 

EHIiVEGl'RKC VHEIX?PE7^E AVKKI>IiEAL5 HlaV 93 

Seq ID NO: C278 Protein Sequence
Protein Accession #: FGENESH predicted



1 11 21 31 41 51 

1 ' ^ • 1 ^ 

MPLGYAYKHA &TIi»SRBTG8 HMfiRGAYORK NTRAAiSRPBB CTDHSWHAOR TRGXIOiGQEiE €0 

EHCSDVFGVS FFHWVRGIJW3 JK3AKUQTFTP AQE6APTVQR QAEALUCCRQ SGRPGBGG9UB 120 

SESU\RDA8ML &PL&AAMRMY PT£&TXPPRk SYSPXEIAHK SY&CGliPDMK ISHAfi&GPGIr 160 

DSUSIlkEDaE SOSPPLVTHIj YETj<3VVTTG« EQLDFBTGPN IPDIiQIYVKD EVGVTDLQVlj 240 

TVQVTDVNSP PQFQGIilliT^ HCRADQPHFN AH8HTYVRW ATALASHRLR S8I6SPFLGT 300 

FCWVaHQTF IilSPPKSPRM SAHGTDFSTT ELDFBAGURS FHIirVEVIlDS OGLKASTELQ 360 

VHIVSHiNDBV PRFTSPTKVY TVLEECiSPGT IVAKITAHDP DDBSPPSHLL YSITTVSKYF 420 

HINQI.TGTIQ VAORIDRDAQ ELRQSIPTZSI* ETLVKDRFYa GQENRIQITF XVBDVHDNPA 480 

TOQKFTFRE8 LHPALCSKTI> THMDTVIJ)CF HAADKDIPVT GRFTKBRGI»I 6LTVPBGWG8 540 

I/rZHABGKBB QVTa»4IX3SR QRDRACVOKL LLIKPSDIJHR IsSSYHQiH^G KTCPHDS19S 600 

TQVPPTTCRW SRIQATNNED TSSVTVTVNI USBNDBKPIC TENSYFLALP VDLKVGTN3Q 660 

NPIOUECTDIIjD SSPR&FRY&X GPGHVKHsJIIFT F£PNAj3fitNVT Rldil/rSRFDY AOGFDKIfOTY 720 

KLliVyVTDDM IiMSDEKKAEA. liVETGTVTEiS IKVTPHPm ITTTPRPRVT YQVmKNVYS 780 

P8ASfYVP£rVI TLGSILLLGI. IiVYIiWUJUC AIERHCPCKT GKHKEPI.TKK GBTKIAERDV 840 

WETmHHTI FDOEAJDPEP EQASI^BLYAZi LPSCCSPSPV T£iRICVaV€GE SBBTGQCSOa 900 

ZTIiPGiaPVD DPRICQBTGLQ Q>FSVNTLCP AVKWVGGPQ AEIbdRLALS LKKYG8D 957 



Seq ID NO: C279 Protein Sequence
Protein Accession #: XP_168S71.1

1 11 21 31 41 51 

I I I I I 1 

MINQLTGTIQ VAQRIDRDAG ELRQNPTISL EVLVKDRPYG GQEMRIQITP rVBEWHmiPA 60 

TOQKFTFSIK VPERTAKOTIi LUDUilKFCFD DDSEAFNNRF NFTMPSSVGS GSRFLQDPAG 120 

SGSCrVLIGDIi DYEHPSKItAA GEIKYTVIIQfV QDVAPPYYKN NVYVYIIjTSP ENEFPLIFDR 180 

PfiYVPmrSER RPAiQ{3EI*SGP BEiaaiZs9ia>I VRAVCHHFGIi HIASOSPKVP GRPIGQSBPQ 240 

TliPLomfEBQ GT&DKSRRKFB DCRERRRGGN YPDESYL 277 

Seq ID NO: C280 Protein Sequence
Protein Accession #: NP_005257.2

1 11 21 31 41 51 

111111 

MODHaFLGlSF LEEVHKHSTV VGKVBTbTVLF IFRMlrVIiGTA AE88WGDBQA DFRCDTIQPG 60 
C3QHVCYDQAP PISKZSYHVIi QIIFV81PSL VYMGEAMBTV SHQBKRKliRK AERAKBVBGS 120 
GSYEYPVAEK AEI^SCWEEGN GRIAI>QQTI>L KTYVCailiTR TTMBVQFIVG QYFIYGIPLT 180

TLHVCKRSPC PHPVNCYVSR PTERNVFIVP E^fLAVAMiSLL IjSIiAEIjYHLG WKKIRQRFVK 240 

PRQBMAKCQL SGPSVGIVQS CTPPPDENQC LBKGPGGKPP NPPSNMMASQ QNTDHLVTEQ 300 

^ VRGQBQTPQE GFIQVRY<K2K PBVPHGVSPQ HRZ»PHGUISD XRRLSKASSK ARSDDIiSV 350 

Seq ID NO: C281 Protein Sequence
Protein Accession #: NP_055274.2

1 11 21 31 41 51

I I I I I I

MYIiSXCCCFL IiWAPAIiTlrKK I.KYSVPEEQ6 AGTVIGETIGR DARIiQPGLPP AERGGGGRSK 60 

SSSYHVLEH6 APMLLDVDAD SGLI.YTKQRI CRESLCRBKA KCQLSLEVFA ^KEICHIKV 120 

EIQDIKDHAP SPSSDQIEMD ISEMAAPGTR PPLTSAHDPD AGEHGliRTYL LntDDHGLFG 180 

LDVK^RGDGT KFPELVIQKA 'U3REQ0HI1KT liVLTALDGGE PPR6ATVQXVI VK7ZDSHDN5 240 

15 PVFEAPSYIiV ELPEEIAPIiGT WIDLNATDA DEGPNGEVIiY SFSSYVPBRV RKLFGIDPKT 300 

GLIRVKGMLD YBENGMLEID VQARDDQPNP IPAHC!KVTVK LmRHDMAPS IGFVSVHQGA 360 

LSEAAPPGTV lALVKVTDRD SGKNGQIiQCR VLGGGGTGGG t3C5LQGPGC5SV PFKLKENYDH 420 

BYTWTDHPL DRETQDEYW TIVARDCSGSP PUISTKSFAI KILDEJJlDNPP RPTKCSLYVLQ 480 

VHEMUIPGEy LQSVIiAODPD LGQNGTVSYS IlrPSHIGDVS IYTYV3VHPT MGAIYALRSF 540 

ZO NPEQTKAFEF KVLAKDSGAP AELBSNATVR VTVUJVNDEIA FVXVIfPTLQSI DTAELQVPRN 600 

AGLGTLVSTV RALDSDFQES GRUTYSIVITG NDDHLFEIDP SSGEZHTLHP FBTBDVTPWE 660 

LWKVTDHGK PTLBAVAKLI IRSV0GSLPB OVPHVHGEQE MWDMSLPblV TLSTISIJIiL 720 

AAKITJAVKC XRENXEIRTY NCRIAEY8HP QLGGGKGKKK KIHKHDIKIAr QSEVSERKAM 780 

NVKDIWSSPS liATSPHYFDY QTRLPX*SSPR SEVMYLKPAS £INLTVPQGHA GCHTSFTGQa S40 

TKASBTPATR MS11Q7DNFP AEPKYMG&RQ QFVQSISVAP SIiRTQICBPA 889 






Seq ID NO: C282 Protein Sequence
Protein Accession #: NP_0055d2.1

1 11 21 31 41 51 

I 1 1 I I I 

HBLCRSUUiCt GGSLQIiMFCL lALSTDFWFB AVGPTESAHS GLWPTQHODI ISGYlHVT(yr 60 

F^IKAVItWAZi VSVSFX.V1J&C FPSLiFPPGEO PEiVSTTAAFA AAIGKWAMA VYTSBRWDQP IZO 
PBPQIQTFFS HSFYIiGWVSA ZI&LCr6AIi& LGAEOQGPRP GYSTL 165 

Seq ID NO: C283 Protein Sequence
Protein Accession #: NP_006424.2



1 11 21 31 41 51 

I I I I I I

HATHAZjIiIiLA AMLJUSHPGLV FSRIiSPEYYD lAKABLSDBB KSCPCliAQEG PQGDLLTKTQ 60 
ELGRDYRTCL TXVQKLKKMV DKPTQRSVSU AATHVCKTGR GRWRDVCRtfF HRRYQSRVTQ 120 
GLVAGETAQQ ICBDLRliCIP STGPL 14 B 

Seq ID NO: C284 Protein Sequence
Protein Accession #: NP_005594.1






1 11 21 31 41 51

I I 1 I I I 

MKVSAAALAV ILIATALCAP ASASFYSSDT TPCCS'AYIAR PLPRAHIKEY FYTSC^KCSiaP 60 
AWFVTRKNR OVCAHPBiacW VRBVIK8LEH 8 91 



Seq ID NO: C285 Protein Sequence
Protein Accession #: NP_071437.1

1 11 21 31 41 51 

\ \ \ \ ] \ 

^ HA&GRAVA6L LI.LAAAGLGG VAEGPGUVFS BDVI^SVFGAN liSLSAAQIiQB lAEQ^SAASR 60 

60 VOVPEFGQI^ FNQC2iTAEBI FSl£GF&M3\!r QXISSKFSVI CPAVE<QQ£iNF UPCEDEtPKHK 120 

TRPSESEVWG Y6FLSVTIIN lAShLQI^TUT PIiIKKSYFFK XIiTFFVGIiAI GTXtPSt3AlFO ISO 

IiIPSAFBFDP KVDSYVBKAV AVEGGFYIiLF FFBRMLKMLI* KTYGQ^KSHTH FCSnSSIFGPQE 240 

XTHQPKALPA INGVTCyAISP AVTEKXIGHIH FZMVfiWSLQ DGKKBPS9CT CLKOPKIiSEl 300 

^ OTIAXmiTLC DAI£NFI£}GL AIGA8CTL8L LQGLSTSZAJ LCEEFFBEU3 DFVILZiHAGH 360 

05 STRQALIiSXlF I«9ACSCyVCIZ* AFGIDVONHP AESZIFAIAG GMPI.YISXAD MFnB3HDtmiKUl 420 

EKVTGRKTDF TFFMXQNAGM LTGFEAILI«X TLYAGBrSIiB 460 






Seq ID NO: C286 Protein Sequence
Protein Accession #: filP D041TJS.1



1 11 21 31 41 51 

I I I ; I I . 

HEN8BPASLL BI.FHSXATIQa BLVRBbKAO)! A8KDBID8AV KHTiVSLKHSY ICAAAGSDYKA 60 

75 BCPPISIPAPT 5MHGSDATBA EEDFVDPHTV QTSSAKSIDY DKliIVRFGeS KIDKBIrlNRX 120 

ERATGQRPHH FIJIRGIFPSH RDMNQfVlJDAY EHIOCPPYLYT GRGP9SBAHH VaHIiIPPIFT 180 

KWLQDVFNVP liVlOMTDDEK YIWKDI.TXJ3Q AYODAVSE3AK DlIAiGGFDIN KTFIFSEaJlY 240 

VGKGGGFYVa^ WKIQKHVTF NQVKGXFGFT DSDCIGKXSF FAIQAAPSFS KSFPQIFRDR 300 

TDXQCLIPOI. ZIXSDPYFRHT RDVAPJUQYP KPAUjHSTFF PALQGAQTKH GASDPBTSSIF 360 

oU LTDTAKQXKT RVHiOlAFSGG RDTIEB&RQF KXI/FFFLEDD DKIiBQIKKDY 420 

TSGAKI/CGEL KXAIiIEVIiQP LXAEBQAIIRK EVTDBIVKBF MTPBXL8EDF Q 471 



Seq ID NO: C287 Protein Sequence
Protein Accession #: NP_004929.1

1 11 21 31 41 51

t I 1 I i t 

MTVFRQSNVD DYVSTOBEIiG 8GQFAWKKC REKSTGIiQYA. AKFIKKRRTK SSRRGV^ED 60 

IHRSVSIIiECE: IQHPNVITLH EVYEHKTDVI LILELV3W3QE LFDFIjAEICES LTEEEATEPL 12D 

KQTLNGVYYI, HSLQIAHFDIj KPENIMU^BR NVPKPRIICII DFQliAHKlDF GNBFKNIFGT 180 

PEFVA7EIVH YEPLGLEADM t4SIGVZTiriI> tiSGASfFLGD TKQ&TLAHV8 AUNYEFHDBY 240 

F&UTfiALAEX) FIRRLbVICDP KKSMTIQDSI. QHFWIKPKDT QQAIiSBXftSA VNMBKFIQC&A 300 

ARKKHKQSVR LISLCQRIiSR SFI>SRS»MSV ASSDDTIj&EB DSFVMKAIIH AZHDDKVF6L 360 

QHT.TiCSIjSHY DVNQPNKHOT PPKLIAAGGG NIQILQLLIK RGSRXITVQOK OGSNAVYHAA. 420 

RHGHVETTLKF LSENKCPLDV JODKfiGEMMJI VAARYQHADV AQVTCAASAQ IPISRTKEEE 480 

TFUICAANHG YYSVAKAIjCE: A£3Ca«VNlKNR SSBS9hUIRB AR6YHDIVEC LAEHGADUZA 340 

CDKD^IAIiH IiAVRRCQKEV IKTLLSQGCF VDVODREEQNT PlaHVAOCDGir HPIWALCEA 600 

JStCSSHSDISmKf GRTPUBIiAAN lasiIiDWRrL CLHGASVEAL TmaKTA&DD ARSBQHEHVA 660 

GIiIxARLRKDT HRGLFIQQLR PTQ2ffLQPRIK DKIiPGHSGSa XTTI»VESLKC eLLRSFFRHR 720 

RPRLSSTNSS HFPPSPIiASK PTVSVSlMNli YPGCENVSVR SR8MMFEPGL TKGKbBVFVA 7B0 

PTHHPHCSAD DQSTKAIDIQ NAYIKGVGDP SVWEFSOHFV YPOCXDYFAA NDPTSIHWV 840 

FSLEEFYBID LNPVIFHL5F LKSIjVPVBEP ZAFGGKLKNP LQWlrVATHA DIHNVPRPAG 900 

GEFGYDKDTG LLKEIRHRFG NDIJ1I6NKLF VLDAGASGSK PMKVXiRfilBIiQ BIHSgXVSVC 9fiO 

PPKTHECTKI ISTLPSHRKIi NQPHQXJKSLQ QFVyDVQDQIj SlFLASSEDLR RIAQQLHSTG 1020 

BUflXMQfiETV QDVLLLDPRW LCTNVLGKLL SVSTPRALGH YRQKYTVEDI QRLVPDSDVS 10 BO 

EUiQZIiDAMD ICftSDIiSSOT MVDVPAIjllCr mLaRSHADB SDEVMVYGGV RXVPVEHiyrP 1140 

PPOGIFHKVQ VNIiCSKIHQQ STEGDADIRIi WVNISCKIiAHR OAKLLVI^LVN HGQGIEVQVR 1200 

GLETEKIKCX: I^LIiDSVCSTZ HNVMAlTTLPG UiTVKHYLSP QQLREHHEPV HXyDFRDFFR 1260 

AQTLKET&LT NTI43GYKESF SSXMCFGCHD VYSQASLGMD IHASDUfllJiT RRKIifiRlIjPF 1320 

PDPLOKDWCL LAMNIiGIjPDli VAKXNTHNGA PKDFLPSPLH AI4>RBWTTYP ESTVOTLMSK 1380 

IjRSLGRRDAA Plil^KASSVF KTm^SWQQB AYASSCNSGT GYKSISfiWfi R 1431 

Seq ID NO: C288 Protein Sequence
Protein Accession #: NP_002072.1

1 11 21 31 41 51 

] \ \ \ \ \ 

MBLSAROHWIj ZiCAAAAIjVAC ARGDFA5KSR SOGEVRQIYG AKGFSLSDVP QAEISQEHIiR 60 

ICPQGYTCCT SEHESmKNR SHAELETAI.^ DSSRVIiQAI^ Ai:QIaR8FDDR F<3HLI.HDSER 120 

TLOATFPGAF GEItYTONARA FRDLYSBLRD YYRGANLHIjE EXXAEFWARIj UERI^XQUSB IfiO 

QLLLVTfDYVO CLGKQABAIjR PFGEAPRHLR URATRAFVAA RSFVQSLGVA fiOWRKVACfV 240 

FI/3PECSRAV HKLVYCAHCL 6VPGARPCPD YCRNVIiXGCIi AMQAIIZJSAEV BMLXOSHVIiI 3 00 

TDKFWGTSGV E3VIGSVET14 liABAIHAJjQD HSDTLTAZVI QG06»PKVli)P QGPGPEEKRR 360 

RGKUlfRBRF P&GnfEKLVG BAKAQUOJVQ DFHISLFGTL CSBRHAIiSTA. SDDRCHKQHA 420 

RGRYZ*PEVMG BGLKSQUttll? EVEVDITKBD MTJRQQiHQZ* KIMIHRIA&A YKGBIDVDFQD 480 

ASDDaSGSGS 490 

Seq ID NO: C289 Protein Sequence
Protein Accession #: AAH30205.1

1 11 21 31 41 51 

1 t I 1 I I 

HIILIYLFUj LKBDTQGWGF KDGZEHHSIli IiERAAOVYBR BMtSQKXnn: YAEAKAVCEF 60 

EOOHUITYKQ IiEAARKXGHH VCAAGHMAKG RVGYPIVSCPG PNGGFGKTGI IDYGXRIiURS 120 

ERWDAYCYNP HAKBCJGGVFT DPKQIFKSPO PPHBYEDKQI CYWHIRLlCYQ OHIHLSFLDF 180 

jyhSDDVQCiA DYVBIYD&YD SVHGFVGRYC GDSjPDDIIS TGHVHTIjKFL SIIASVTAGOF 240 

QZKYVMIDFV 5KSSQGXKTS TTSTGSIKllFZi AGRP^HI. 277 

Seq ID NO: C290 Protein Sequence
Protein Accession #: NP_001973.1

1 11 21 31 41 51 

I 1 I I I I 

MRASIDALQVIi 6IiIiFSI>ARGS EVGNSQAVCP GTLHOLSVTG DAEHQYQrXiY IdiYSRCBWM 60 

taniBrVTiTGH laADLSFlK^I REV7G2CVI>VA MHEFSTLPLP HLKWRGTQV YCGKFAIFVH 120 

i;tmnHSSSA LROLRZiTOIjT BXLSGGVYIE SiaDKIiCHHDT IZngSDIVBDR DAEIWKDHG 180 

HSCPP QIEVC KGRCHePGSB DGQmXKTIC APQCHGHCFG ENFNQOCHDE CAGGCSQPQD 240 

TDCFACRHFN DSQACVPRCP QPLVYHKIiTF QlfimmxRV QYG6VCQASC PEMFWDQTS 300 

CVRACPPDKM EVDK23GLKMC BPCXfGLCPKA CEGTGSGSRF QTVDSSIVXSG FVNCTKIIiON 360 

UiFliITGLNG DPWHKIPALD PEKIKVFRTV REITSYUSng SWPPHMBNFfi VFBMLTTIGG 420 

RSliYNRGFSL LIMKtruaV7& IGFHSIiKBIS AGRIYISKNR QLCYHESXiNH TKVXiROFTBE 480 

aUDIXHHRPR RDCVABGKVG DPIiCSSOSCH QPOFOQCL&C RNYSRGGVCV XHOIFUIGBP 540 

REFAHEABCF SC31PBCQPHO GTATCSGSOS Dlt:!AOCAEFR DGPECVS8CP HGVLGASGPI 600 

YKYEDVOSEC RPCHENCTQG CRtSPBLQDCL GQTLVIiIOKT HI»TWALTVIA GLWIEHMljG 660 

GTPliYWRGHH XQKKRAMRR7 IiERGESZBPL DPSBKAHKVXt ARIFKETKLR KbKVIiGSGVF 720 

GTVHRQVnlP BGE8XK1FVC ZKVIEDKSGR QSFOAVTDHK liAXaSISHAH XVRLIiGLCFa 780 

SSLQLVTQYL PTjGSLXJIHVR QHRSAIdGFQZ. IlLHWCVQIAK GMYVZiSBaSH VHRNDAARNV 840 

LX.KdPSQVav ADFGVADLIjP PDDnQLLYSB AKTPXICBIMAIi BSIEFGECYTH QSDVWSYGVT 900 

VWELMTPGAH FYAGLRIAEV POTJiEiaSEHL AQPQlCTlIJV YMVMVKCWMX DEMIRPTPKB 960 

UUVBFTSMAH CPFRYXiVlKR E6GPG1APGP EPEGLTNKKIi BBVELEPELD £.DU3l£AEED 102 0 

NZATTILGSA IiSI>PVGT£HR PROSQI^ULSP S5GYKPMNQG £tLOGfiQQE8A VS68S6RCPR 1080 

EVSZAFMPSG diASBSSEGH VTG8B1VELQB XVSHCRSRSR SBSPRPBaDS JOESORBShh 1140 

TFVTFL8PPG Z.BBEDVSIBYV MFDTHLK&TP SSSBGTXiSSV GLeSVXGTEE EDSDBBYByH 1200 

rxRssatseevH pprpssiiebe, gyeymdvosd IiSASQDgstqs cpI)BPVpxmp iMSTrfiDBDy 12 60 

EYMZflRQRDOG GPQGDYAAMG ACPASBQGYB EnRAFQGPQH QAPIIVHrARI> KTIAtSIiEATD 1320 
SAFQNFDYKH SRIiFPKANAQ RT 1342 
Seq ID NO: C291 Protein Sequence
Protein Accession #: NP_001207.1

1 11 21 31 41 51

I I I I I I

MRPLCPSPWL PLI/IPAPAPG I»TVQliIiSLI, LLHPVHPQHI, PRHQEDSPLQ GGSSGEDDPL 60 

GEBDLPSEED 5PREEDFF6E EDLPGEEDLP GBBDLPEVKF KSEE&GSLKL EDLPTVBAPG 120 

DPQEPONHAH RDKEGDDQSH WRYQODPPWP HVfiPACAGRF QSPVD3RPQD AJ^FCPAMIPI* 180 

Et*LGPQLPPL PELRIiHNNGH SVQLTLPPGL EMALGPGEBY RAUQLHLHWQ MfiRPGSEHT 240 

10 VBGHRFPA&I HWHI^STAFA RVDB^^GRFG GLAVIiAAFliE 3SGPEEl<r&AYE QLLSRIjEBIA 300 

BBGSETQVPG LDI8ALLPSD FSRlTFQyEQS Z>TTPPCAQGV IHTTH9QTVM laSAEQIiHTIiS 350 

DTIiWaPGDSR ijQXJJFRATQP LHGRVIEASF PAGVDSSPRA ABEVQUISCL AAGDHALVF 420 

GLLPAVTSVA FLVQMRRQHR EGTKGGV5YR PAEWiETGA 459 

Seq ID NO: C292 Protein Sequence
Protein Accession #: NP_004198.1

1 11 21 31 41 51 

1 I 1 I I I 

/A) MGGAWCEGP TGVKAPZX3GH GHAVLFGCFV ITGPSYAFPK AVSVFFKBIjI QEFGIGY&DT 60 

AHieSILLAM LYGT6PLCBV CVNRFGCRPV MliVGaLFASlt GHVAA8PCRS IZQVYLTTGV 120 

ZTGIdGIAZaiF QPSblHLlSRy F&KRRfiMAHC lAAAGSPVTIi CAIiSPLGQU. QDRYGWRGGF 180 

LXIiGGaJIiLNC CVC7^MRPI« WTAQPGSQP FRF&RRliUJli 8VFRDRGFVL YAVAASVMVL 240 

GLFVPPVFW flYAKDLGVPD TKAAFliTIL GFIPIFAUPA A0FVAQljaKtf HPYBVYLPSF 300 

Z5 SMFFNGLADL AG5TAGDXGG I>WFC1FFGI GYGMVGALQF EVLMAIVGTH KF9SAZGIiVIi 360 

XMEAVAVIjVG PPfiGGKLLDA THVYHYVFII* AGABVI/TS^Ii lULLOBrFFCX KKKPKBPQPS 420 

VAAAESSKLH KFPADSGVDZj RSVEEFIiKAB PBSCHGEWHT PETSV 4^5 

Seq ID NO: C293 Protein Sequence
Protein Accession #: NP_000349.1

1 11 21 31 41 51 

} I I 1 I I 

HAIiFVIlI*IiAli AliALALGPAA TLAGPAKSPY QLVIiQESRlA GRQHGPHVCA VQKVIGTNRK 60 

35 YFTNCKQWYQ RKICX3KSTVI SYECCPGVEK VEGEKGCPAA LPLSNl»YBTt GWQSTTTQL 120 

TTDRTEKZiR? £Hfi6P6&FTI FAPStOEAHAS liPAEVU)SI.V &HVinKLL£3A LRYHMVGRRV 180 

LlDBLiaiGMT IiTSHYQIZSNZ QZEHYFN6IV TVNCASLLKA I3HHA7NGVVH I^IDKVISTIT 240 

NNIQQIIEIB DTFSTLRAAV AASGIiSITMLB GHOOYTIiI^AP TN£?^£KIPS ETLNRILC^P 300 

EAIiRDIiDCQSH lUCSAMCABA iVAGliSVETL EGXTLEVGCS GDMl^TINGKA IXSHKDXIiAT 360 

40 NOVXHYIXIEL LIPDSAKTLF SLAAE80VST ArDIiFRQAai» GNHIiSGSERl^ TLLAPLtTSVF 420 

KDGTPPmAE TRNIiIfillHIX imOliASKXLy UGQTLBTLGG KKLRVFVYRH SIiCIBfSCiA 480 

AHI3KRGRyGT IflMOfiVLTP PHGTVHDVIjK GPNRFSMIjVA AlQfiAGtiTBT LSREGVYTVF 540 

AP^nfSAFRAI> PPRSRSRU^ nAKmiANILK YHIGDBILVB GGIGALVBliK SI^ySDKIfn/a 600 

liKHNWSVNK SPV2\SP£)IMA TNGWBVITN VLOPPAHSFQ BRODELADSA USZFHQASAF 660 

45 GRAGQRSVRIi APVXQKLLER VSm 683 

Seq ID NO: C294 Protein Sequence
Protein Accession #: NP_006527.1

1 11 21 31 41 51

I I i / i ) ) 

litTORSIAGPI <3niKFVTI>LV ALSSELPFLG AOVQIiQIISiaY fil^TilAlKPQ VPBaQNLISH 60 
IKEHITEASF YXtFKATHREV FFRNIKIIjIP ATMKANmSK IKQESY^SKAN VIVTDHYGRH 120 
<3DDPYTIK2YR 60GKBGKYIH FTPHFIiUICN ItTAGZGSRGR VFVEEHAHUEL WGVFDEmHD 180 

55 KPFYXNGOKIQ IKVTRCSSDl TGIFVCEKSP CPQEHCIISK LFKB6CTFXY KSTQBATASI 240 
MFHQSL55W EFCSASTHMQ EAPiaXiQnQMC SliRSAHinrrr D&ADFHESFP MNGTELPPPP 300 
TPSLVQAGDK WdiVLDVSS XMAERDKLLQ LQQAASPYLM QIVEIHTFVG lASKDSKGBI 360 
RAQLHQIMSN DDRKLIiVflYL PTTVSAKTDI SICgGI.KKBF EVVEKIJ«3KA YGSVMILVTS 420 
GODKIiU^TCE. PTVI.S5GSTI ESIALGGGAA PKLBBLBRLT GGItKFFVPDI SHSNSMISAF 480 

qO SRISSGTGDI FQQHIQLEST GEfilVKPEHaiL XHTVTVDNTV GfiQXnfFLVTff QASGPPEIIX> 540 

FDPDGRKYYT lOTPI^ranjTKR tTAELWlPGIA KPG3HTYTLEI HTHHSLQAIiK VTVTSRA8I23& 600 
AVPPATVHAF VERDS1*BDPPH PVMIYAKVKQ 6FYPlIiI$IATV TATVEPETGD PVTTiRUiDDG S60 
AGADVIXZIDS lYSRYFFSFA AHGRYSL3CVE VNHgP3XSXP AHdXPGSHAM YVPGYTAIiK^ 720 
. XOMiaAPSKSV GRBESERRna FSRV&SGGSP SVIiGVPAGPH SDVFFPCKIl DLEAVKVEEB 780 

05 IjILSHTAEGE DFDQGQATSY BISMSKSLQN XOODEHinuIr lnn!&aCB«fQQ AGIRBIFTF8 840 
FQISTNGFBH QPHGETHESH RIYVAXRTOfD RNSLQBAVSH lAiQAPUFXPP NSDFVPARDY 900 
LILKGVLTAH ^IGIICLIX WTHHTLBSK KRADKKEHGT KUt 943 

Seq ID NO: C295 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

1 i I t I I 

MKFI,LII>I»LQ ATA8GALPLH SSTSLBKNZ^ IjFGERYLEKF YGLSIKKLPV TKHKYSOSUf 60 
75 KEK1QS4QEP liQLKVTGaiJ} T5TLEMMUAP RGGVPDVHHF REMPGGFV0R 2EEY1TYRINN 120 

YTPPMi9Rm)V DYAIRKAFQ7 HSNVTPIiKPS iClimSHADIL WFiVRGAEGD FBAFDGKGGT 180 
LAHAFGFGSa TQGDAHFDBD EFnTTBSGGT HXiFLTAVBBX GKSIiGLGnBS DPKAVHFPTY 240 
KYVDINTFHIj SAODIRGIQS LYGDPKEHQR liPNPDNSEPA XiCDPIirLSFDA VTTVGNKXFF 300 
FKDRFFNLKV SERPKTSVNIi XSSTLIIFTLPS QIEAAYBIEA RHOUFLFEDD KYNUfiEnUEU? 360 
OU BCNYPKSIHS FGFPNFVKKI DAAVEMPRFY RTYPFVONQY MRYDSRSQMH DPQYFKLITK 420 
NFQGXGPEOD AVPYSKHKYY YFFQG8HQFE YDFLLQRITK TLKSHSHPGC . 470 

Seq ID NO: C296 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I 

MKFLIjlljtiliQ ATASGfll>PljN SSTSIiEKtMV liFGERYIjEKF YGLSINKLPV T!KMKYSC3?LM 60 

KEKIQBMQHF LGLKVTGQIJ3 TSTLEMMHAP RCGVPDVHHF SBMPGGPVHR KHYITYRINH 3.20 

YTVmaSWEUV SY&IRKAFQV HSHVTPLKFS KIHTGHADII:* WFARQAHGD EKAFDGKGGI 180 

LAHAFGPGSG lOGDAEFDED BFHTTHSGGT HI.FLTAVHAI GHSLGLGBSS DPKKWFFTY 340 

KYVDINTPRl. SADDIRGIQS LYGDPKEMQR LIWPEWSEPR I,CDFNI»SFDA VTTW^JKIPF 300 

PKDHPPWLKV SERPKTSVNI. I9SLWPTLPS GIKAAYKIKA BNQVPliFKDD KY«1jISNLRP 360 

SPMYPKSIHS FGFPI^FVKKI DAAVFNPUFY RIYFFVDNQY KAYDERRQMM DPGTPKblTX 420 

VFQGXGPKID AVPYSIOIKYY YFFQGSliiQFS YDFIiIiQRITK TIiXSHSHFGC 470 

Seq ID NO: C297 Protein Sequence
Protein Accession #: NP_008883.1



1 11 21 31 41 51 

I I I I 1 I 

MAKEWStVRC PQGI»I»IFGNV XIGCCGIALT AKCIPFVSDQ EBLYPIiL£2VT DliilDDiyGAAK 60 

IGIFVGIdaP diSVliOIVOI MKSSRKlUCiA YPILMFIVYA FEVASCITAA TQRDFFTHilL 120 

20 PliKQMLERYQ HNSPPHIHDDQ HKNHGVTKTH DRXJ4LQDtTGC GVNBPSDHQK YTSAFSXJBEIEf 180 

mmrPliIPRQC CVKHNIiKEPIi miEACKZiOVP 6FYHHQGCy& I1ISGPMGIRH& ftGVAWFGFiU 240 

IiCWTFHVLLG THFYNSRISY 26D 

Seq ID NO: C298 Protein Sequence
Protein Accession #: NP_001784.2

1 11 21 31 41 51 

) I 1 1 1 I 

MaiiPRQPIAS liliLbQVCWLQ CAASEPCRAV PREABVTIiEA GGABQEPGOA USKVFMGCPG 60 

30 QEPAI^STDN DDFTVBKGET VQERRGZiKER NPLKIFPfiKR ILRBEKKDWV VAPISVFEHQ ^20 

HQPFFQRLNQ DKSNKDRDTK IFYSITGFQA DSPPEGV7AV EKETGHLIiLK XBUXBlSElMS. 180 

YSLFGHAVSE HQftSVEDPMN ISIIVTDQMD HKPKFTQDTF RGSVLBGVIiP GTSVHQfVTAT 240 

DEDDAIYTYN QWAYSlHeQ EPKDPHDLMF TIHRSTGTIS VigSt3IiDREK VPEYTLTIQA 300 

TDMDG^GSTT TAVAWEIWJ ANDNAPMFDP QKYEiUtVPEK AVEHBVQRLT VTDIiDAPNSP 360 

35 AHRAIYIilMG GDCGDHFTrT THPBSHQGlIi TTRnBU>FEA KHQHTIiYVEV TKEAPFVItXIi 420 

PTSTATIWH VBDVNEAPVF VPPSKWSVQ EICS1VTGS3SVC VYTAEDPDKE HQKXSlTRXCiR 480 

DPAGWIAHDP S5GQVTAVGT LDREDEQFVR VNIYEWKVIA HnHOSPPTTG TOTLLLTLID 540 

VNDHGFVFEP RQITIdlQSP DIiSPHTSPFQ AQLT[>DBDXY WTAEVKEEGD 600 

TWLaijKKFIj KQDTYDVHLe LSDHKaSKBQL TVIRATVCDC HGHVETCPOP WKGQFILPVI> 660 

4-0 GAVLAItLFLL IiVLLLt«V]2KK RKXKSPIiLIiP EDDTJUSIVPY YGEBGGGEED QDYDITQIiHR 720 

GItBARPBWZi BHEVAPTIXP TSHYRPBPAN PDBIGHFIXB HUOAHXDPT APPYDTUIWF 780 

DYBGS6SDAA SIiSSLTSSAS DQDainroYUT BHOaSFKijClih DWrGGGEDD 635 

Seq ID NO: C299 Protein Sequence
Protein Accession #: NP_005620.1

1 11 21 31 41 51 

) I I 1 i I 

MAXICSAEHOX Y^SQDEKKG PIjIAPGEDGA PAKSDGPVGL GTFGGRXiAVP FBETHTRQKD 6D 

FIMSCVGPAV GliGNVNRFPY LCYKHGGGVP Z>rPYVI>IAI.V GGIPIFPI^BI GLGQPr4E»3S 120 

HaVWHICPLF KSU3YASKVI VFYQlTyYIK VLAHGFYyiAT KSFTTTXtPWA TCqHT H HTPD 180 

CVEIFECEBDC ANT^SLAHLTC DQIADBRSPV IBFHENXVUl I1GG6LEVPGA UflffEVTLCLt 340 

A£3WVIiVYPCV WRSVKSTQKI VYPTATPPYV VLWM.VRGV IjIiPGAUDCJII YTLKPDWaKIi 300 

GSPQVWIEUW3 TQIFPSYAIO liGALTALOSY NRFWNBICYKD AIILALINSG TSFPAGFWP 360 

55 SIUSEMAABQ GVH16KVAB8 GPGLAFIAYP RAVTIiKPVAP IiHAAIiFFFMli LIiUSLDSQFV 420 

6VB6FITGLL DXAPASmfTR FQRBI&VAXiC CAI£FVIDLS MVTDGGHYVF QLEDVYSASO 480 

TTIiLWQAFHB CVWAHVXGA I^IFMDDIACH ZSyBPCPHMK WSRSFWTPW GHSIFIEHW 540 

YYEBimrnr TmsmtsBm obiafaii8&hii cvpxaiiLGCL XAMOSiMftEH nqbltqfzkg goo 

UlHXfaitAQD ADVRGL'TTI/r PVSBSSKWV VESVH 635 



Seq ID NO: C300 Protein Sequence
Protein Accession #: NP_006507.1



1 11 21 31 41 51 

I I I I I I

MBPSfiKKZ.Ta SIHLAVGGAV LG8LQF6YHT GVIiaAPQKVI BEFVNQTHVH KZaESXXfTT 60 

UTILIfSLSVA IFSVSGNIOS FSVaLFVKRF QBBNSMLMMN LXAFYSAVIM 6FSKLGRSFE 120 

MLIU^PIIG VY06LTTGFV PMYVGEVSPT AFRGALGTXiH QLOIWOIIiI AQVPGliDSIH 180 

<9RI>IiWPLLIt 8IIFIPAU1Q CXVIjPFqPES PRFIiLIlIHinS £BlRAKSVIiKK LRGTADVTED 240 

70 I'QEHKEESRQ HKRBKKVTIZ. ELFReSAYKO PILIAWLQL &QQI>SSINAV FYYSTSIFBK 300 

AGVQQFVYAT IGSGIVNTAF TWSI^FWBR A6IZRITiHI.XQ IMStOUXZKLli MTlAZAXiLBQ 360 

X^HMSTLSrV AIFGFVAFFB VGPOPXPHFZ VABUPSQGPR FAAIAVAGFS IWTSNFJVBN 420 

CFQrVBQIiOS PYVFXIFTVIi LVIf FIFTYF XVPETRCRTF DBIASGFRlQG GA8Q&DKCPB 480 

BIimPLGADS QV 492 



Seq ID NO: C301 Protein Sequence
Protein Accession #: XP_035292.2



1 11 21 31 41 51

I I I I I I

HI^GAGPKSRA IiAAPAA&BEB EABSSMLAAK SADQSAPABB GBGVTLQRNI TLIMGVAIIV 60 

GTIIGS6IFV TPTOVXiKBAG SPGLALWHA AjQGVFSrVGA LCYAEUSTTI SKSOODXATK 120 

IiEVYOSIiPAF ZJONIBItl*!! SPfiSQYXVAZi VFATYUiBPIi FFTCPV^CA AKUnbCLCVL 180 

LIAAVNCYBV KAATRVOCAF AAAKLUOAIi IILLGFVQIG ]B3DVSMU>KEl F8FBGTKI2IV 240 
QHIVLRDYSQ IjPAYGGWNYIi NFVTEEMTNP YENIjPLAIII SliPIVTLVYV LTHIiAYFTTIi 300

SrBQMIrSSBA VAVDFOmrHL GVMSWIIFVF VGLSCPGSVN GSLPTSSRbF PVGSREGHLP 360 

SIIiSMIHPQIt IiTPVPSLVFT CVWTIXYAPS KDIPSVINFF SFEMWIiCVAI* AIIGMIHUiH 420 

RKPELERPIK VHUUbFVPFl LACltFLlAVS FWKTPVEOGI GFTIILSGLF VYPPGVWWKN 480 

5 KPKHLIiQGIF STTVLOQKLM QWPQET 507 

Seq ID NO: C302 Protein Sequence
Protein Accession #: NP_005259.1

1 11 21 31 41 51

1 I I I I I 

MHWgiFEGLL SGVMKYSTAF GRIWLSLVFI FRVLVYIVTA ERVWeDDHKD FDCNTHQPGC 60 

SHVCFDBFFF VSKVRLWAIiQ IilEiVTCPSIib WMHVAYRBV QEKRHH^ASG E37SGRIiTUiP 120 

QKKaOGIimtT 1CVC8LVFKA8 VDIAFLYVFH SFYPKYILPP WKCUADFCP NIVDCFISKS 100 

13 GElQTIFTliFK VATAAICJZiIj NIiVEIriyiiVS KfiCBECLAAR KAQAMCTGHH PHQTTSSCKO 240 

BDIiLSGDliIP LGSDSHPPLL PDRPRDKVKK TII» 273 

Seq ID NO: C303 Protein Sequence
Protein Accession #: NP_005121.1

1 11 21 31 41 51 

t I t I I I 

MKICSLTUjS FLLIiAAQVLLi VEQKKKVKUG liHSKWSEQK X>TLQt7TQXKQ KSRPGKKGKF €0 

^. VTKDQAHCRH AATEQEEGIS IJCVECTQItDH EF8CVFAGNP TSCI»KLK]7ER WWKQVASNIi 120 

2^ RSQia>iCRy& ktavktrvcr kdfpbsslkl vssmiE<ssnK prkbktemsp rbhikgkett xbo 

pgSTiAVTQTM ATKRPBCVBD PIWkNQRKTA XiEFCGBmSS LCTFFL81VQ DTSC 234 

Seq ID NO: C304 Protein Sequence
Protein Accession #: AAH22542

1 11 21 31 41 51 

I I i { ! I 

MCSEIILl^E VLKDGFaRDli LIKVKFGBSI £DLHTCRLLI KQDIPAOIiYV DPYELASIiRE 60 

BN1TEAVMV3 EMFDIBAENY LSKBSSVIiIY ARHDSQCIDC PQAFLPVHCR YHRPHSEDGE 120 

35 ASIWNHPDL UIFCDQAGSR RMIRFRFD8P DKTIEFPILK CHAHSBVAAP CALENEDICQ 180 

WNKMKYKSVy KNVIZiQVPWG I>TVHTSIiVCS VTIiLITIJ^S KKKKK 22S 

Seq ID NO: C305 Protein Sequence
Protein Accession #: NP_004985.1

1 11 21 31 41 51

| | | | | |

HSLWQPLVLV LLVUSGCFAA PX^QRQ&TIiVZ* FPCa3IjRTHI*T DRQIiAEEYXfY RYOYTRVAEM 60 

RSBSKSDQPA XiCtWUCOLSXi PETGEEkDdAT LKAMRTPROG VPDLGRPQTF EGDLKStHHfim 120 

4D lTYf(£QNYSB DLFRAViraA FASAFALHSA WPEnTFTRW SR^ADIVIQF GVAEHGDGYP 180 

FTCICDGItLAH AFPPGPG1Q6 DAHFIOSDELW SLGK6VWPT RFta3ADGA2u:r H?PFIFEGRS 240 

Y&ACTTDGRS DGDPHCSTTA NYDTDDRFQF CPSHRLYTRD GtfADGKPCQF PFIFQGQSY3 300 

ACTTDGRSDG YRWCATTANY DHDKLF6FCP TAADSTVnsa HSAGBLCVFP FTFLGKEYST 360 

CT8EGRGD6R LHCATTSHFZ> SDXSnQFCFD QBY^FLVAA HEFGHALQLD HSSVPEALHY 420 

FMYRFTJBQFP UOCDDVIIGIR BLYGPRPEPE PRFFTTTTPQ PTAPPTVCPT GPPTVHPEER 480 

PTAGPTGPPfi AGSXGPPTAQ PSTATXVFLS PVinDAiOIVNI FDAIASXONQ IkYIiFICDGKYW 540 

RFSEGRSSRP QGFFLIADKW PALFRKLD^ FEBPLSKELF FFSGRQVWVY TGASVLGPRR €00 

LDKLGLGADV AQVTGAIiRGa ROKMUtFSGR BliMRFZTVKAQ HVDPRSASEV DBHFPGVPU) 6^0 

TEIZ3VFQnU5K AYFOQDRFYW ZCVSSRSEXjBIQ -VDQVGTVTVD IliQCPSD 707 

Seq ID NO: C306 Protein Sequence
Protein Accession #: NP_000204

1 11 21 31 41 51

| | | | | |

KAGPRPSFHA RI^LZAAIrlSV 8LS6TIiA23RC KKAFVKSCTE C:VRVDKDC!AY CmEMFRDRR £0 

CNTQAHLliAA GCQRESIVVM EdfiFQ^TEBI QlDTTIiRRSQ MSPQGIrRVSIr RPGEERHFEL 120 

EVFEPMaPV P LYIMTO FSH SHSDDIiDNIiK KHC3QNIARVL SQLTSmTIG FQKFVtMCV&V 180 

FQTDHRPEKIi KEPUBN&DPF FSPKHVISLT BDVDBFBaKIi QQSBZSOIZD AFEGGFUAIIi 240 

OJ QTAVCXRDZG WR9D9THI.I>V FSTESAFinrB ASI3AX9VLAGX MSRHDERCKL STTGTyTQYR 300 

TQDYPSVPTh VRLIiAKHNIl PIFAVTKYSY SYYEKLHTYF PVSSLGVM3E DgSNIVSW^ 360 

BAPNRIRSNIt DIRAIiDSPRG IjRTEVTSKHF QKTRTGSFHI RRGEVGIYQV QLRALEHVDG 420 

THVCQLiPEDQ KGHIHLKPSF SDGZiKMEAGI ICDVCTCEIiQ KBVRSARCSP NODPVCGQCV 480 

CaBGHSOQTC HCBTOSLSDI QPCXmGBDK FCSOBGEGQC GHCVCVGBGR YEGQFCB^DM 540 

rU FQCBBTfiGFL CHDRGaCSHC? aCVCBFOHTO FfiQ3CFIjiSHA ^TCIDSNGGIC HGRGHCEGGR 600 
COCHQQSIjYT BTICEIHYSA IHPGLCEZ>IA SCVQGQAnGO? GEKRQR!rCE5 CHFKVKMVDB 660 
UCRT^EVWR CSFBSEDDDC TYSYTMEODO APGP1{3TVX>V HKKKDCPPGS FWHIiIPIiLLIf 720 
IOjPLLAIiIjIiIi LCHKYCACCK ACLAI^IjPOCH RGHKVSFKBD HYMLRBHLMA SDHmrPKLR 780 

SGHUDGRDW RttKVraBaMQR PGFATHAASX UPTELVFYGIj SIiREiKRXiCIB NZrLKPDTKBC 840

7^ AQLRQEVEEI7 XiNEVYRQISS VBKLQQTKFR QQPNAGKKOD HTXVlKCVlMA PREAKPAIiUC 900 
LTEKQVEORA FHDLKVAPOY yTUTADQDAR G04VEFQEGVE LVDVKVPLPr RPEDJQDEKQL 960 
I.VEAIDVPAG TATLGRRLVN I^IXXBQARD VV8FBQFEFS VfiSODQVARl PVIRHVIiDCSQ 1020 
K&QVSUn.TQD GTAQGNRDYZ PVBGELLFQP QBAHKBLQVK EJiELQBVDSL I^RORQVRRFH 1080 
VQLSKPKFGA HLGQPHSTTI IJCBDPDEIiDR SFTSCSHL&GQ PPPBSDLGAP QSnREiAXAAGS 1140 

OU RKXHEimiiPF SGKEWGYRVK YWIQCZ^SESB AHLUSSKVP? VBEiTSLYFYC DYSIKVCAYG 1200 
AQGBGPYSSli VSC31THQEVP SEPQRIAFNV VSSrVTQLSW AEPAETNGEI TAYEVCY6LV 1260 
NDDNRPIGFM KKVLVDNPKN RMLLIKBILRE SQPYRTTVKA ENGAGW<3PER ERIIBIIATQP 1320 
KRFHSZPmP DIPIVDAQSQ EDYDSFliKYS DDVIAEPGGG QRPSVSDDTG OGWKFEPLLG 13 BO 
SEUXIfiRVTH RLPFELIPRIi SASSGRS&DA EAFTAFRTTA ASAGRAAAVP ASATPGPPOB 1440 
HDVK(3iU«a3FA FPGSUfSLHR MTTT9BAAYG THLSPHVPHR VLSTSSXIjTR DYNSLTHSEH 1500 

SHBTTLPRDV STLTSVSSHD SRI,TAGVPDT PTRLVFSAUS PTSLKVSWQB PRCERPLQGY 15 6 O 

SVEYQiiLNGG ELHRLWIPNP AQTfiVWElXL LPNHSYVFRV RAQSQEGWGR EStEGVllX'BS 1620 

QVHPQSPLCP IiPOSAPTIiST PSAPGPItVFT AliSPDSLQLS HERPRKEHOD IVOYLVTCEM 1680 

AQGOGPATAF RVDGDSPBSR t/TVPGhBESSV FYKFKVQAST TEOFGPEREG IIIIB5QDGG 1740 

PFPQLGSRAG IiFQHPI.QS£Y S61TTTETSA TBPFI.VDGLT liGAQHIfEAaQ SI>TJUIVTQEP 1800 

VSRILTTSGT LSTHHDQQFF QT 1822 



Seq ID NO: C307 Protein Sequence
Protein Accession #: NP_076404.1



1 11 21 31 41 51

| | | | | |

iSTIVLPVZjYL ZIFVASILLN

DAQFGPWYFK FILCRYTSVZi 120

LSVCVWVIMA VLSLPWIILT 180

IBIGCriAIS RYIHKfiSRQF 240

DRIvLDBSAQK I1.YYCKEITL 300

IiQSVRRSSVR lYYDYTDV 358



Seq ID NO: C308 Protein Sequence
Protein Accession #: NP_065840.1

1 11 21 31 41 51

MVWCTiOriAVI, SLVIfiQGADQ ROKPEWSW GRABBSWLG CDI/LPPAGRP PIiHVIEWLRF fiO 

GFIiLPIFIQi' GLYgPRlDPJD YVGRVItLQKQ ASIOIBGLRV EDQQWYECaiV FFLDQHIPED 120 

UFANOSStfVm* TVNSPPQFQE TPPAVIOTQB LBPVTIJICWA RGSPLEHVTW KUU3KDU3QG 180 

QGQVQVQNGT LtOtCRVEnGS S6VyTC3QA8fi TEGSftTHATQ LLVLQPPVIV VPFKNSTttH^ 240 

DV SQDVSIACHA EAYPAMLTYS KFODMIHVFH ISRLQPRVQI LVDOSLRLLA TQPDDAQCYT 300 

CVPSNGErliHP PfiA&AYljTVI> CMPCTIRCPV KAKPFU>FVS KTKDQKAIiQI* DKFPGW8QGT 360 

BGSLIIALGH BDALGEYSCT PYNSLOTAGP 8PVTRVLLKA PPAPIERPICE EYFQEVQREL 42 D 

I»IPCSAQGDP PFWSWTKVG RGLQGOAQVD SKrgsi.ILRPI. TXERHGHKBC SASMAVARVA 4flO 

rSTHVYVLCTT SPHWIMVSV VftLPKGRNVS ffEPCFDSSVIi aaFSVWYTPI. AKBSDRMEHD 540 

HVSIAVPVGA AELLVPGLOP HTQVQFSVLA OmUiGSSPFS BIVLSAFB8I. PTTP2UVPGI.P 600 

PTEIPPKLSP PR6LVAVRTP IIC5VI.KHWDPP ELVPKHUJGY VT^RQGSQG WEVLDPAVAO S6t> 

TETELLVPGL IKDVliYEPRI, VAFAGSFVSD PSNTANVSTS GLBVYPSRTO LPGLLPQPVL 72 0 

AGwaoVCFIi GVAVLVSIIA GCLIiNHRRftA RRRHKRLHQD PPIiIFSPTCK SAAPSAI^SG 780 

SPJJSVAKLKL QOSFVPSIiRQ SLLMGDPAGX FSFBPDPPS8 RGPIiPLSPXC RGPDGRFVHG 840 

PTVAAPQBRS GREQAEPRTP AiQIREaARSFDC ESSSPaOAPQ PI.CIED16PV APPPAAPP9P 900 

XiPGPGPWJY LSIfFFREMN' VDCDMPPtjES PSPAAPPDYM £ITRRC!PTS3F LRSPETPPVS 960 

PRESDPGAW GAGATAEPPY TAIADWTLRE RLLPQLLPAA PRGSLTflQSe BROSASFLRP 1020 

PSTAPSftGGS YLSPAPGDTS SNASGP£RHP RRKHWTVSK RHHTflVDENY BHDSEFPGDM 1080 

BUiBTIiHXiGL ASSRLRPEAB TEL8VKTPBE GCLt^iTAaVT 6PBASOUUCA BBPXJUPRRBR 1140 

DATHAHI.PAY RQPVPHPBQA TMi 1163 

Seq ID NO: C309 Protein Sequence
Protein Accession #:

1 11 21 31 41 51

| | | | | |

BYRliGPIiliGK GGFGTVFAQH EliTnRLQVai 60

GAGGGHPGVI RUiEJHPETQE GEMLVl^ERPL 120

IQHCHSRiSVV HRDIKDENXL ZDLBRGCAKb 180

RHOymiiPAT VNGLGZLEiYD HVOGDXPFEH 240

F88RP&LS&I IiI23PHKQTPA EDVTPQPLQR 300

H9QO 334



Seq ID NO: C310 Protein Sequence
Protein Accession #: NP_002501.1

1 11 21 31 41 51

| | | | | |

WHCLYYFLGF LIiIiAARIiPLD AAKRFHDVLQ NBRPSAYHRE BKCMIGIfSSD ENDHNBXLYP 60 

VHKIM3DMRWK WSMBSGRvaA VUrSDSPALV GSaiTFAVNIi ZFPRCQKEDA UOOIVWNC 120 

HHBAGLSADP YVYNWrAKSB 30SDGKBIQTGQ SHBNVFPDGK WPHHPGMHR WHPIYVPHTL 180 

GQYFQKLGaC SVSVSVNTAS VTUGPQLMEV TVYRRHORAY VPIftQVKDVY WTDQIPVFV 240 

TMFQKJaDRNS fiDETFLKDLP IMBDVliXHDP SHFIOTSTIK YKWgPODHTG LFVSTNHTVM 300 

TOHTYVU9aTF^ LEILTVKftAAP GPCPPPPPPP aPSKFTPSLQ PAGDN^USLS RIPDOIGQIN 3fiD 

KYGMPOATIT XVBSILEVMI lOWTDVLMPV PWMSSfiELrDF WTOQGSIPT BVCTIiaoPT 420 

CEITQNTVCS PVEVDSlCIiL TVBRXfiMGSQ TYCVNLTLGD DTSEAUTSTIi ISVPDRDPA6 480 

PliRMKNaAH SVGCLAIPVT VISXtLVYKKH KEmPIBHSP <aNWRSlQSZj6 VFWSAKAVF 540 

FPO^QBKDPIj UOOQEFKGVS 5£{) 

Seq ID NO: C311 Protein Sequence
Protein Accession #: Eos seq

1 11 21 31 41 51

| | | | | |

OU MRlIrKHFIAC JQIJ.CVCRLD WAHGYYRQQR KLVEEIGHSY TGBOiHQRlimO KKYPTCN9FK 60 

QSPIUIDBDIi rrQVHV30LKKI. KFQGOIOKX&L ENTFIHNTGK TVE129IimDY RIVSYSG^SEMV 120 

FKASKITFBn CKCSSKSSDGS ERSliBaQKFp LEHQIYCFDA DRFSgPBEAV KGKGKEiRAIjS 180 

mPEVGTBEaKr LDFKAIIDGV ESVSRFGKQA ALDPFIIiiaJI. LPWerOKYYl YNHSIiTSPPC 240 

TDTVDWIVPK DTVSI9&SQL AVFCBVItTHQ QSGYVHIiHDY LQHNFRBQQfY FySRQVFSSY 300 
TGKEBIHBAV CSSEPSHVQI^ DPEEtYTSLlrV TETBRPRWYD TKI&KFAVI»Y QQLDGEDCnK 3fi0

HEFLTDCnOD U3&ZI(HHI.I.P NM8YVLQIVA ICOMGLYGKy SDQXilVDMPT DHPELDIiFFS 420 

LIGTBEIIKE EEB6KDIEEG AZVNFORDSA TNQIRKKEPQ ISTTTHYHRX GTKYNEAKTKr 480 

R9PrRG9EFS QRGDVPNTSL NSTSQPVTKIi ATEaa3lSI,Tfi QTVTEIjPPHT VEGTSASLND 540 

GSKTVIiRSPa MHLSGTASSI. NTVSITEYEE ESLLTSFKZiD TGASDSSGSS PAT8AZPFIS 600

EHISQGYIFS SEHPETITVD VIilPBSARNA SEDSTfifiGSE BSLKDP&MBS NVKFFSSTDI 660 

TAQFDVOS(3R ESFLQTHYTE IUVDBSEinrr KSF62USVMS QSFSVTDLEK PHY5TFAYFP 720 

TEVTPHAFTP SSRjQQDIjVST VSmrYSQa-TQ PVY21EA9HSS EESRIGUVEG liESEKKAVIP 780 

LVrVSAbTPI CLVVLVGIlil YHRKCPQTAH PYLBDSTSPR VISTPPTPIF PISDDVGAIP 840 

10 IKHFPKHVAD LHASSSFTES FEEVQSCTVD LGITADSSHH PDSIKHKHRYr N^VAYDRSRV 90O 

KLAQLABKDG KLTI^YUIHNY VDSmRPRAY XAAOQEPIiX&T AEDFWRMXWS HKVEVIVKXT 960 

miVEIQQRRKC DQYHPADGSB BVGHFLVTQK SVQVIAYYTV RHFTIiRiaTKI KKGSQKQRP5 1020 

GKWTQYHYT QWPDMGVPEY SLPVLTFVBK AAYAKHHAVG PVWHCSAGV GRIGTYIVtO 1080 

SMLQQIQHBG TVNIFGFLKH IHSQRNYWQ TEEQYVF3HD Ti»VBAILSKE TEVI^JSHIHA 1140 

15 YVSIALLXPGP AGKTKLEKQF QLliSQSNIC2Q SDYSAAI.KQC HBBKNRTSSZ IPVERSRVGl 12 OO 

5SIiSOBGT3jy INASYIHBYY QSNEFIirOfll PU£TIKDFI7 BHinDHHAQD WKIPDGQtlM 1260 

AEDBFVynPH KDEPINCBSF KVTLHftEEHK CLSHEEKLXI QDFILEATQD DYVLEVSHFQ 1320 

CPKHPNEDfiP ISKTPEEiISV iKBBAANRtX? PHIVBDEHGG VTAGTPCALT TLHHQIaBKENT 1380 

SVDVYQVAKM INZMRP6VFA DIEC2fZQFLYK VIZiSZiVSTSQ BBNPSTSUDS NGAALPDGNI 1440 

AESIiESIiV 1448

Seq ID NO: C312 Protein Sequence
Protein Accession #: XP_031379

1 11 21 31 41 51

| | | | | |

^aaLKRFIAC IOLIiCVCRZ»D HAHGYXRQQR IOiVEEIGHSY TGKUlQKimG KKYPTCngPK 60 

QSPIHZDBDL TQVUVHIiKKL KFQGHDKTSb EHTPIHNT^K TVEINLTHDY RVSGGVSEMV 120 

FKASiaTPHM GKCNMSSDQS ETlBLEGQKPP I*EMQIYCFDA DRPSSPKEAV KGKGKEJIAI.S 100 

30 ZLFBVGTBEN LDFKAIXDGV BSVSaFOiKQA AUDPFX&liNL LPHSTDKXYI YNGSLTSPPC 240 

TDTVWIVPK DTV8ISE8QIJ AVFCEVIiTHQ QS6YVMXMDY LONNPRBCSgy KPfiRQVFSSY 300 

TQKEEIHEAV CSSEFmVQA DPBHYTGIibV TNFRPKWYD TtaSKFAVX^Y QQM96EDQfnC B60 

HEFLTDGYQD IXSAILNNIJjP NMSYVLQIVA ICTHQLYGKY SDQIJV13MPT DNPBIJ>IiPPE 420 

UOTSCIIKE EEEGKDIBEG AXVHPORDSA TNQIRKKEPQ ISTTTEYNRX GTKYNSAKTN 480 

35 SSPTRGSEFS GKGDVPKTSIi HSTSQPVTKCt ATEKDlSIiTS QTVTEI>PPET VBGTSASCHD 540 

GSKTVIASPH mniSOTAESI. KTVSXTBYBB ESLLTSFKI.D TGAED8SGSS FATSAIPFIS 600 

EErXSGOyZPS SEMPBTITYD -VXilPESARIlA SEDdTSGGfiS EaX>KDPaHBe mVHFPSSTDI 660 

TAQPDVGBGR ESPIKJTNfYTS IRVDESEKTT KSPSAGPVMg QGPeVTIJLEM PHYfiTPAYPP 720 

TEVTPHAFTP SSRQQDLVST VKWYSQTTQ PVYHGETPLQ PSYSSBVFPI. VTPLDLDWQI 780 

40 I.MTTPAAS88 DSAIiHATFVF PSVPVQFESZ I>S6¥DGAP3i£> PFfiSA8F8SB IrFSHZiRXVSQ 840 

IIiPQVTSATB fiDXypUIASlt PVAaGDLI.LE PSUWQfYSDVI. STTHAASBIXi BFG8ESGVbY 900 

KTEiHFSQVHP P8SDAHHBAR SSGPEPaYAIi 8DI1E68QEIF TVSYSSAIPV HDSVOVTVOG 960 

SLFSGPSHIP rPKSBLITPT ASDI^SPTHftL 8GD9BWSGA8 SDSEFIiiliPiyr DGLTAU7I35 1020 

FVSVAEFTYT TSVF(S>DNKA LSKSEIIYGtr STELQIPSFN EMVYPGESTV KPKKyDSlVMK IQBO 

45 I^EtASLQETSV SXSSTRGKFP GSI>AHTTTKV FDBEISGfVPB KETFeVQPmT VSlQRBGDfrSh 1140 

KFVZ1SKH8EP AS8DPA3SEK LSPSTQLIiFY ETSAfiPGTEV LLQPSFQIW VEXTZJiKTVLP 1200 

AVPSDPIXaVE TPKVDXXS8T H13IIiXVGaff&A BSBNKI^ST9 VPVFDVSPTS aMH&ASLQQL 1260 

TZSyAGEKYB PVIiLKSBSSH CfWPSLTSKD ELFQTAlilLEX laQAHPPKGRH VFATPVIiSID 1320 

EPUJTLIKKL XHSDBILT8T KSSVTGKVFA GIPTVASDTF VSTDHSVPXQ WGHVAITAVS 13S0 

50 PHRDQSVT&T KU^FPSKHTfi BL8H8AKSDA GI>VGGGEDGD TI>DDGDDDDD DMSBGItSXH 1440 

KCHSCSSnUB SQEKVHNDSD THEHSLMDQir NPXSYSEiSBll SBSDHSSTSV fiSDSQTGHOR 15D0 

SPGKSPGANG LSQKRIflDGKE EHDIQTQ3AI* XiPIiBPBSKAH >VLtr8DBBSQ 90QOISDSUT 1560 

EH ETSTDF3F ADXa^SKDADQ XIAAGDS&IT PGFPQ8P?88 VTSEHSSVSH VBEABftSNSS 1620 

HESRIGIABG IjESBKKRVXP LVIVSALTFI CLWLVGU^I YHHKCPQTAH FYLBDSTSPR I6BO 

55 VXSTPPXPXF PISDDVGAIP IKEFPKHVAD ZJXftSSGFTEB FETIiKEFYQE VQSCTVDUSX 1740 

TKOSSNZZPDH SBKBERYIHIV AXPBSRVKLA QZAEKDCSQKLl? mnnAHyVDO YHEPKAYXAA 1800 

QGFXiKSTAED FHRtanESNV BVIVMIDMLV BHBKRKCDQy WPADGSEEYG HFLVTQKraVO 1B60 

VUVYYTVPSIF TIiItNTKXKKG SQKIGRPSGfiV VTQfyHyTQHP DMSVPKYSZiP VI^FVRKAAY 1920 

AKHHAVGPW VHCSASVGET GTYIVIJDSMIj QQIQHBGTVN IFGFLKHIRS aHNYr.VQTEB 1980 

60 QyVFIHDTIiV BAXI.SKETEV KJlSHlHAYVH AEiXiIPQPAGK TKLEKQFQZiL SQ6NIQQSDY 2040 
SAAUCQCMRE KN9TS9XIFV EitSRVOiaSli &GB6TDYTHA SSdMSYYQSH EFZITQHPliIt 

HTIIDDFWSHX WDBHAOLTVM IPDGQ?!lK2U8I> EFVyKPHKDB PXHCB&FICVT LMAEEmGOiB 2160 

MEBKLIIQDF XXJBATQDDYV IiEVBHFQCPK WPNPDSPXSK TSBUCSVIXE EAKNRDGPHI 2220 

VHDBHGQVTA GTFOWbTTLM HQLEKENSVD VYQVAKHUnt MRPGVFKDIB QVQPIiYKVXIi 2280

D5 GLV&TKQEEer PSX&LDSNGA AIiPDGCIIAES I.E8DV 2315 

Seq ID NO: C313 Protein Sequence
Protein Accession #: NP_002842

1 11 21 31 41 51

| | | | | |

KRIEilERFZiAC LQLtCVCRXD flANGYYKQQR KZiVEElGnST TQUArQKHWG XKifPTCll8PK 60 

Q8PZHIDEDL TQVKVSIZjKKIi KFQGWOKr&X. EHTFISMTOK TVBINI^TSIDY RVSGOV3EMV 120 

FKASKZTFHn GKCHHSSD08 &B8LEGQKFP IiEMQlYCFDA DRF8SFSBAV KGKGKLRALS 18 D 

75 XDFEVGTBBtT X>DFKAIID6V ESV8RFGKQA AU)PFXi>Lin> LPNCSTDlOnn YHGSErTSPFC 240 

TDTVDHXVFK DTVSXSESQZ> AVFCEVIOMQ QaGY^nHLHDY ZJQ^iSlFREQQY KFSRQVFSSY 300 

XOKEBIHBA7 CS8BPBNVQA PPEBTYTSLLV TWERFRVVm mESKFAVIiY QQLDGBDQTK 360 
HEFLTDGYQD LGAXUniI£LP 2IM8YVI1QIVA XCTtSGLYGKY SDQLIVDMPT DNPELDZ^PB 420 

IjIGTEBZXKB EGEGKDIEEG AXWPGRDSA TNQXllKXEPQ XSTTIHYNRI OTinnilEAXTN 480 

80 RSPTHGSEFS GKGDVPiaTSli USTSQPVTKli AXEKDISUTS QTVTELPPHT VEGTSASUffD 540 
GSKTVLRSPH MHIiSGTAE:SIi HTVSXTBYEE SSIiLTfiFKLD TOAKDSaaSS PAT8A3lPPXS 600 
EmGQGYIFS 3ENPBTITYD VXiXPB8ARl)IA SKDSTSSGSB E8L1CDP&MEG KVnFPSfiTDl 660 
TAQPDVGSGR BSPIiQTlffYTB IRVDESEKTT K3F8AGPVH9 OGP8VTDI1EM PHYSTFAYFP 720 
TSVTPHKFTP SSRQGfDXiVST VElWYSgTTQ TVYNAEASNS SHBSKIOIAB GLESBiCKAVX 7BD 
PLVIV9A!>TF JCCLWltVSZL XYMRKCFQTA HFXIiHDSTSP RVlSTPFTPl FPISDDVOftl B40

PIKHFPKHVA DLHftSSGFTB BFETLKEFYQ EVQSCTVDIiG XTADSSHRFD HKHKSKYIMI 900 

VAYDBSRVKI^ AQLAEKDGKL TTrYINANYVD GYWRPKRYIA AQ6P1jK8TAB DPWRMIWBHN 960 

VEVIVMlXNIi VEKQRHKCDQ YWPADGSEEY GNFLVT^KSV QVUOrTTVBN PTWUPTKIKK 1020 

3 65QHBRP9GR WTQyHYTQW POMGVPB^Gl. PVXfTFVRKAA. WHCSAGV6R 108l> 

TGTY2VljiD5M liQQlCRBSXV NXPGFLXHIR SQRNyZ^TE EQyVFIHDTI. VEAIIiSKETB 1140 

VU>3HIHAYV 13%LI.XPGPAO KTKLBKOPQL L&QaHIQQSD YSAAItKQOIR EKNRT8SIIP 1300 

VERSRVGIS3 LSQEOTDYIN ASYIMGYYQ8 NEPtZTQHPl> LHTXKDFnBM IY9DI1HAQLW X2 60 

MIPDGC3NMAE DEPVYWPKKD EPINCESFKV TLMAEEHKCL SNEHKLUQD PII»EATQDDT 1320 

10 VXiB\nRHFQCP KWPHPDSPZS KTFHLXSVIK BEA»igRDGFM IVHDEHGGVT AGTFCALTTL 1380 

MHQIiBKESSV 0VYQVAKHIN LMRPGVFADI EQYQPIilfKVl LfiLVSTRQEE NPSTSUiStaO 1440 

AATfPDONIAE SUBSIAT 1456 

Seq ID NO: C314 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MaiLKflFLAC IQLLCVCRLD WANGyYRQQR KLVEHIQWSY TQZMJilQKSHG KKYPTCNSPK 60 

20 QSFINZDBDL TQVKVNIiKKI. KFQQHDKTSZ. EMTeiHHTGK TVBIULTHDY KVSGGVBQW 120 

KVPKASKITTF BHCSCCrtMSSD GGESSLBGQK FPIiEHQiyCF DADRFSSFEB AVKGRGKLRA 180 

LSILFEVGTB BHU2FKAIID QVElSVfiRFGK OAALDPFILL HLLFHSTHK? YXinaQSLTSP 240 

PCnDTDHIVF KDTVSIfiESQ lAVPCEVIiTM QQSOYVMmD YLC2KBFRBQQ YKFSRQfVFSS 300 

yTGKEEIHEA VCSSEPENVQ ADPENVT6LL VTMERPEWY DTMIEKPAVI* YiQQLEJaEDQT 360 

25 KBEFLO^DCSyQ DLGAILHNLL PHHSYVrOIV AICINQZiYGK YGDQZ,IVnMP TDNFBLDLFP 430 

SbXGT^IIK EBBEGBDIBB GAIVHPCSRDS AINQIRKXEP QXSTTTSXHR XOTmKSnlCr 4B0 

NR&PTRG8EF SGKGDVPXaTS UISTSQFVTK I1ATBKDISI.T &QTVTELPFH TVEGTSTkSCH 540 

IXISKTVIiKSP HMNIiSGTASS LHTVSITEyS SBSWT9FKI. DTC3AEDS&GG SPATSAIPFX 600 

eEMISQGYIF SSEWPETITY DVIjIPESAHN AfiEnSTSSGe EESLKDP8ME GMfVWFPaSTD €60 

30 ITAQPOVGSG HESFliQIKYT EXRVDE8EKT TKSFSAGFVH SQQPSVTDIjB MPHY&TFAYF 720 

PTBVTPEAFT PSSRQQDZiVS TVHWySQIT QPVmSABHS SHESRXGIAB GLBSEKKATX 7 BO 

P1.VIVBAI.TF XCLWLVGII. lYnRKCFQTA RFZLBDSTSP RVISTPPrTPI FPXSDDVGAI 840 

PIKHFPKKVA DLHASSGFTB BPBTLKEFyQ EVQSCTVDLG XTADSSNHPD WKHKNRYIHI 900 

VAYXnarSRVKL AQItAEKOGKL IXiyiNANYVD GYNRPKAYTA AQaPLKSTAB DFHHMIWEaCI 560 

35 VEVXVMXTNL VEXSBCRKCDQ YHPADGSEBY ONFliVTQKBV QVLAYYTVRN FTXANTKIKK 1020 

GSQKSRPSGR WTQIfHVTaM PDMGVPBySI* PVJUTFVRKAA YAKRHAVaPV WHC&AGVGR 1080 

TOTYIVLDSM LQQIQBEGTV NXFG(P£JKHIR GQEam.VQTB EQYVFIHDTIj VBAILSKBTB 1140 

VliDSHIHAYV HAIiLXPGPAG BTKLBECQFQL LSQSEIXQQSD YSAAl^CHR EKIlRn?S8IIP 1200 

VERGRVGISG IiSGEGTDYJN ASYIMGYYgS I^lXTQEPIi LHTIKDFHRH IViDHNAQLVV 1260 

40 HLPDOONMAE SEFVYWPHKD BPXNGBSSKV TLMAEESHCD SZBBBKIiXlOD FiXiEATQDDY 1320 

VLBVRBFQCP KNENFDSPZS KTFEDISVXK BERIUiREGFM ZVHDEBGGVT AGTFCAIiTTI* 1380 

HUgiLBKEIirSV rFVYQVAKMlK LMRPGVFADI EQyQFIiYKVZ XiSXiTSTHQBB HPSTfiLDSWe 1440 

AAI.PDSNIAB SLESLV 1456 

Seq ID NO: C315 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

DU (IRXIiKRFIiAC JQ&LCVCRIJ> MANGYYRQQR KLVEBXGWSY TGAIWKNWQ KKYPTC2ISPK 60 

Q8PI1IIDEDL TQVJSmaXKL KFQGKDBTSEi ENTFZHNTGK TVB£HI.1XIDY RV6GGVSEKV 120 

FKASKZTFHH GKCKHSSDGS EBSlaEGQSCFP I»BMQ1YCF13A DRFSSFEBAT lOaHSKLRALS IBO 

XltFEVGrEEN* XJDFKAXXDGV BSVSRSGKQ^ AU^PFlLUflt LPHSTDKYYI YNGS&TSPPC 240 

TDTVDWXVFK XTTV^XGESOL AWCEVZ.TMQ QSGYVHLMDY I^QEIHFREQQY KFSRQVFSSY 300 

55 TGKEEXHBAV CSSBPEHVQA DPEHYTGIiIiV TBIBRPRWYD TMXEKFAVS»Y QQmGBDQTK 360 

HSFUEDBYQD LGAIUOEaULP UM8YVUQXVA XCTNGIiYGKY SDQLIVDUPT DKPBLDItF^ 430 

LXGISBZIKS EEBGKDIEB6 AXVUPSKDSA TNQZRKKEPQ I8TT7KS3ERX GTSWEKSOXI 4B0 

RSPTBG9EFS GIQSDVPNTSli SiGTSQPVTKIi ATBECDISIiTa <;fTVl'BIiPPHT VBGTSASIiND 540 

GSKTVIiRSFH HNLSGTAESb TirVSlTEYEE SGtiLTSFKLD TGAEDS9G3S PATfiAXPFXG 600 

60 E»XSQGYXFS GSilPBTXTYD VLIPBSASKA SEDSTfiSGGB BSLKDPSMBG MVHFP8STDX 660 

ZAQfiOVGGGS ESFLQXHYTB IILVDEGSKTSf K&FSASFVK3 QGPSVTDIjSC FHYGTFAYFP 720 

TKVTFHAFTP S8IRQQDLVST VNVWSQTTQ PVYSiBASH&S iusokIGIiABQ X»B3EKXAV1P 7 BO 

IiVIVSALTFI CLVVLVGlLt YHRICCPQTAH FYLEIDSTSPR VISTPFCTIF PISDEWSAIP 840 

IKHFPKHVAD tEASSGPTKE FETI.KEFYQE VQSCTVDLGI TADSSNHPDH KHKHRYlMlV 900 

05 Ayi}HSRVKI*A QliAEECDGiajT DYXBANYVDG YNRPKAYXAA QGPIKSTABD FHRHinBHHV 960 

EVIVMITNLV 6K19ltRKa>QY HPADGSEETYG HFLVTQRSVQ VIAYYTVBNF TLSaaTKXXSQGp 1020 

90lDaRP8l3RV VTQmrSQnP SHSVPEYSLP TLTFVRKAAY AKBHnyGPW VHCaAOVQRT lOQO 

6TYlVIJ>SHr. QQIQKBC3TVN XFGFLKHXRG QRHYLVDTHB OYVFlBDlMliV BAILSKETEV 1140 

UJSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLBPSIOCBG TISAHGNLPL PGI.TDPPT5A 1200 

70 BRVAGTXXiliG QSETXQQSDY9 AAI.KQCNREK MRTSGIZPVB RSRVGXSSLS OEGXXIYXNA5 1260 

YXMSyYQSNB FXITf^EIPIiUf TXKCS?HRHXV XXENAQIiWMI FDGQHMhEDB FVYWPKKDSP 1320 

IKGBSEKVTIi »AEBHKCEiGN &&KLIXQDFI LEATQpDYVL BVHHFQCPKft PNPDSPISXT 1300 

PBItlSVXKEB AAtgSQDGPMTV HDEHOGVTAG TFCAIiTTZMH QEiBKEHSVDV YOVABHUOLM 1440 

RPGVFADXBQ YQFLYKVI1«S LVGTRQEOIP 8TfiU>8NGAA LPD^nCABSl. BSLV 1494


Seq ID NO: C316 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRH.KRFI1AC IQIJiCVCRLD WANGYYHQQR KliVEBIGHSY TGALNQKNWG KKYPTCHSPK 60 
QSPXNXDEDI* TQVHVNXjKKI. KFQGHDKTSI. EKTPIHUTGK TVElMLUnTY RV6GGVSEMV 12 Q 
FKAGKITFHH GKCamSSZXSS BBSLBGQKFP IiEKQXYCFDA DRFSSFEEAV KSnBKI«RA£iS 180 
XX^FBVOTBEH LDPKAXXDGV B5VGRFGKQA AXJSPPXIiLHli LGMSTDlonrX YNGSUT9PPC 240 
TDTVDWIVFK mVSZaBSQU AVFCBVZAMQ QSOYVMLMDY LQmiFREGQy KF6RQVFSSY 300

T&KBEIHBAV C83EFEMVOA DPEHTTTeLLV THERSRWYD THIEKFAVLV QQLDaEDQTK 360 

HEFX^TDGYQD IiaAIIiNNI.LP KMSYVLQIVA ICTHOIiYGKY SDQLIVDMPT DHPEASNSSR 420 

ESRIQI<AEaii ESEKKAVlPlj VIVSALTFIC LWIiVGILIY HRKCFQTAHF YI.EDSTSPRV 480 

5 ISTPPTPIFP ISDDVOAZPl KHFPKHVADD HASSGFTESF JBTI*KBFYQ&V QSCTVDLGIT 540 

ADSSMHFDHK HKHRYISirVA YDHSRVKLAQ lAEKDGKLTD YIHMfWDGY NRPKAYIAAQ €00 

GPLKSTAEDF flRMIWEHHVB VXVMITNIiVE KGRRKCDQYW PADG3EB!rGH FLVTQKSVQV 660 

lAYYTVHNFT LHNTKIKKOS QKSRPSQaW TQVHYTQWPD MOVPEYSLFV LTFVRKAAYA 720 

KRHAVGPWV HCfiAGVGRTe TYIVLDSMIO QIQHBGTVNI PC5FLKHIRSQ BNYLVQTBBQ 780 

10 YVFIHiyTLVE AXIiSKBTEVIi SSHIHAYVHA LLIFGPAGICr KLEKQEQhl^ 0SHIQQ81>yS 640 

2UlZfKQC3IREK NRTS&IZPVE SSRVGISSZiS GEQTDYUKAS YIMGYYQSMB FIITQHPLIJI 900 

TIKDEHBHIW SHSTAQIiVVMI EDGQHMASDB FVYHINKDEP XNC^SFKVTIi MAEEBKCLSH 960 

B&ICX>IiaDFI CEATQDDYVL EVRHFQCPICW PNPDSPIfiECT FEliISVUOSB AAIIIRDGPHXV 1020 

HDEHGGVTAG TFCAI,nrTX.MH QLHKEH5VZ3V YQVAKMINLH RP6VFAD1EQ YQEIiYKVIUS 1080 

IS IiVSTRQEEEIP ST&U3SHGAA LPDGniAESL BSIi 1113 

Seq ID NO: C317 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MRlLKaPLAC IQLLCVCRU) HANSYYRQQR KIiVEElGttSY TSKUfOKSHG KKYPTCErSFK 60 

OSPmiHEDIi TCfVETVNljKKIj KFQGKOKTSL EtTrFrHNTGK TVElNliTXIDY RVSGGVSEMV 120 

PKAfiKITEHW GKCNR33DQS EHSLEC3QKFP liSMQIYGFDA DRFSSFEEAV KGKQKLRALS 180 

25 lUPEVGTBECI LDFKAIIDGV BSVSRFGHQA AIiDPFZLUHi LtUfiTDKYYI YNGSLTSPPC 240 

XDTVDNIVFK PTVSZSESQX^ AVFCBVUTMC} Q8GYVMU1DY XKmHFRBaQY KPBBOVF&SY 300 

TGKBEIBBAV G8SEPSHVQA DPENYTSIiLV THERPAWSfD TMIEKFAVIsY QQITCEDQTR 360 

HBFLTDGYQD [XJAIIiHNIiDP NM8YVLQ3VA ICTHGCiYGlCY 9DQI>IVDKPT DHPELDI^PPB 420 

LIGTHBirKE EBBGKDIEBG AJVNPOHDSA TNQ3RKKSPQ ISTTTHYHRI GTKXNBRKTW 480 

30 ItSPTRGSEFS QKQDVPKT£I» »8TSQ7VTKZ* ATBKDISZiTS OrVTBLPPIfr VEGTSAEIjSID 540 

GSKTVIASPH MNLSGTAESIi HT79I1SYES BSLLTCFKLD TSVEDSSGSS PATSAZPFZS 600 

EKISQGYZFS SBHPBrmD VLIPESARSn SEDSTSSOSB ESLKDP8HEG NVffFPSSTDI 660 

TAQPDVGSGR BePLQTMYTE IRVDESEiCrT KSPaAaPVM& QGPSVTDLEM PHYSTPAYFP 720 

TEVTPHAFIP ggEQOaLVST VNWYSQTTQ PVYNEASWSS HESRIOIinSQ LESEKKAVIP 780 

35 I.VXVSALTFI CLWLVGILX YKRKCFQTAE FYDEDSTSPR VZSTPPTPIP PXSDDVGAXP 840 

I K H RPKHVaD U»5S(3FTEE »BXL1CBFYQ|B VQeCIVDX.QI TKDSSIIHFJIH KHKHaYINXV 900 

AHDHGRVKLA QIiAEKDGKLT DYXHAHYVDQ YNRPKAYIAA QGPLICSTASD FnRHXWEEIHV 960 

EVIVMITSIiV ERSRRKCDQY KPADG6EBYG KFLVTQKSVQ VUVYYTVRNF TUSKl^KIKKG 1020 

SQKGHPSGRV VTQYKYTQWP DHGVPEYSIiP VLTFVRKAAY AKHHAVGFW YHCSAGVGBT 1080 

40 GTYXVLOSNIj QDiaHBGTVN IFGFLEHIRS QBHYLVQTEE QYVFIBDTEiV EAILSKSTBV 1140 

XJISHIBAYW AIjLXFGP^UGK TKCiEXQFQQL TSjSBBISCBSS TISMXCBOM raiTDPFTSA 1200 

8RVARTXIti:.S QSHXQQ5DYS AAIiKQCUEEK SRTSSIIPVB RSRTOlSSIiS GEGrDYZHAS 1260 

YXHSYYQ&SIE PIITQHPUJI TIKDFWRHI17 OHNAQLWMI PDGQ13HASDE FVYWFHKDBP 1320 

rHCESFKVTD MAEEHKCIiSH EEKIjZXQ13FI LEATQC0YVL EVRHFQCPKW PHPDGPXSKX 13 BO 

45 FBItrSVIXEB AIWBDQPKIV ROBHOGVTAG TFCAUTTI<HS QX£KBH8VDV YQVAKHINLM 1440 

BPGVPADIEQ YQEIiYKVIX^S XjVSXRQEBBIP STfiLDSHGAA LPDGNIAESb BSIi 1493 

Seq ID NO: C318 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KRJLKRFZiAjC IQLLCVaUiD WAHGYYRQQR KLVBBXGi7SY 3X3AIJI1QXHHG KKYPTCNSPK €0 

QSPINIDEDIi TQfVNVNUaaj KFQGKDKTSL ESTFIHIITGK TVEINIiTHDY RVSGQV&SMV 120 

55 FKA9KIIFHH GKCamSBDGS BHSXiEGQlCFP ZiStQlYCEDA DRFS8FEEAV K6KGKi:iBAI.g 180 

ilfbvgtbbh udfkaxxx^ bsverfgrqa xuxeFivvmM uassrnssra hsgsltsppc 240 

TS3TVDRZVFK DXV&ISE5QI* AVTCSVIjTMQ QSQZVMIiHETY IdQHSlFRSBQQy KFSRQVPS3Y 300 

TGKEEIUEAV C9SEFSSIV0A SFSHYTSLLV TnERPRWYO T^XEEPAVIjy QOU3GSDQTK 360 

HSFMBGYQD l/3AItNNLlaP MMSYVLQrVA ICTNOLYGKY SDQLIVDMPT imPBLDIiFPB 420 

60 IiIGTBSXXKE EBEaKDiEBS AIVigP(aU>SA ZNQIRKKBPQ ISTTTOZIIRI GTK^CKEAKTN 480 

RSFTRGSSFS GKGDVPVTPSIa VSTSC^VlKL ATBXDX&IiTS QTVTBIiPPHT VE0T5ASUD) 540 

QSKTVZAePH KHLSGrhSSL HTV8ITEYBB BSI>1UTSFKCD TOASDSGGSS PATEAIPPIS 600 

SNXSQGYXFS S^IPETITYD VIjIFE&ARHA SEt>8i;886SE ESKJO^SMEG NVNFPSSTDl 660 

TAQPDVGfiGR ESFLQTKYTB IKVDESBKTT KSFSA^FVHS QGFSVTDLrai PHYSTFAYFP 720

Q5 TEVTPBAFTP SSSQQDItVSX VSlWYSQrrQ PVYNEASN8S HBSRXGIABG liBSEKKAVIP 7 SO 

XjVIVSALTFX CLV5niVGII.I YHRXCFQWa FYLEDSTSBR VXS3!ePTPXP PISES7VGAXP 840 

TKRFPKHVAD liRASSGFTHB FBTLrKBFYQB VQ6CTVDL6I TftDSSEiSEIXf KHKZIRyXlflV 900 

AYDHSRVKLA OLABKDGKLT DYIKANYVDO YKRPKAYXAA OGPLKSXAED PKRMIKBHHV 960 

EVIVMlTIIIAr EKORRKCDQY HPADGfiEEJffi KFLVa^QKSVQ VLAYYTVENF TLHKTKIKKG 1020 

70 SQEGQRPSC3RV VTQfYEYTQWP DHGVPEYSIrP VLTFVRKAAY AKRHAVaPW VHCfiAGVGRT 1080 

OTYZVLDSML QQIQHBQIVN IFGFIdCHIRS QRBnffiVQT&B QYVPIHDTbV BAIXiSKBTEV 1140 

IiDSHniKYVN AZJilPOPAlBK TKZ.BRQPQUL 6Q6NIQQ8I>Y SAAUDQCMDKB KHRTSSIIPV 1200 

ERSRVGIS8L SGEGTDYINA SYXMGYYQSK BFirTQHPZ^ HTIKDmMI nUBVCAQIfWM 12 60 

IPDOOHHAED EFVY^PHKDE PmCBS^t^JTi U4AEEHKCLS HEBKLZIODF IX^ATQAWRS 1320 

DGRKFLCSDN PYAFTRKRKF RGCLPGSQDD QSDEARSLC 1359

Seq ID NO: C319 Protein Sequence
Protein Accession #: XP_002914.4

1 11 21 31 41 51

| | | | | |

HKDXDIGKEY JJPSESYRSV RBRTSTS61H RDRBD8ECFRR TRBImBC&SMi ETAARABGI>S 60 

X^IASMHSQLR XU>SEaPKBK YBHGXiSAlKP laTTSKHOHP VEWnOLFSOl TFSHLGSLAR 120 

V^OKKGBXiSH EEIVHSLSKHB SSDVNCRSLE IU.WQBEUIEV m»ASUmV VHIFC3n!RXtZ IBO 
liSIVCLMITQ U>IGFS6PAFM VKHI/LEyTQA TESHLQYSLL LVLGLLLTEI VRSWSIALTW 240

ALHlCaTGVRI* RGJUliTMAFK KILIOJCNXKE KSIjGELTHIC SNDGQRMPBA AAVGSIiIiAQG 300 

PWailiQHlY KVIILC3PTGP USSKVFX'hPY PAMMFASRliT AYFRRKCVAA TDESLVQ^NSB 360 

VLTYIKFIKM YAHVKAFSQS VQKISEEBRB ZLBKAGYFQ3 ITVGVAPZW VIASWTFSV 420 

5 HMTI^FDUTA AQAPTWTVF KSHTFALKVT PFSVKSLSEA SVAVDRFKSZ> FIiMSBVHMIK 4 HO 

NKPASPHIKI EMKMATLAMD SBHSgiQHSP KLTPXMKKDK RA8RGKKHKV RQLQRTEHQA S4.0 

VLABQKGHLL DDSDERPSPB BEEGKHIHLS HLRLQRTLHS IDLBIQBGKL VGICQSVGSG 600 

KTSLISAIIiG QMTIiIiEGSIA I6GTFAYVAQ QAniUIATUl DHILF6KEYD EBRYH&VIJfS 660 

CCliRPDLAIL P88DLTEIGE ROAKLSGGQR QRISLARALY SDRSTYZLDD PL8KU>ABVG 720 

10 NHIFNSAiaK HLKSICrVIfFV THQLQVliVDC DBVIFMKSGC XTERGTHSSL MKIiNGDYATI 7 BO 

FIOKIiLliGETP PVEINSKKET SQ5Q2CKGQDK GPKTGSVEKS KAVKPBEGQIi VQLBEKSQ6S 840 

VPWSVYGVyr QAAGQPIiRFIj VIMALFHLNV GSTAFSrimL STfWIXQGBGM TTVTROJETS 900 

VSDSMKDNPH MQYYASIYAL SKAVMtlLKA IRGVVFVKQT LHASSRLKDB LPRRILRfiPM 960 

KPFUTTETGR lUIRPSKDHD EVDVRIiPFQA EMPl<»IVIbV FFCVOMIRGV PPHFLVAVOP 1020 

15 I-VXIiFSVLHI V£RVI.IREUC RZJ)NITQGPF liSHITSSlQO XjATIHAYNKQ QEFLHRYQEIi 1080 

LDDMQAPPFt* PTCANHWIjAV RH>LISTAI>I TTTGIiMIVLM HGQIPPAYAG LAISYAVQUr 1140 

GUiFQFTVRIiA SBTEARFTSV ERIKHYIKTL SUSAPARIKH KAPSPDWPQE QBVTPENAEM 120 D 

AYRSMBPLVli KKVSFTIKPK EKIGrVGRTQ SGKSSLGMAL FRLVELSGCC IKIDGVRISD 1260 

IGEiADUeSKL SIIPQBPVIiP SQTVRSNIDP FHOfTTEDOlW DALERTHMKE GIAQLPLKLE 1320 

20 QHVKSSiGiaStP SVGERQUiCZ ARAUtRHCO I.ILDEATAAM DTGTDZrUQS TIREAFADCT 1380 

MLTIAHRIiHT VIiGfiDRIMVEi AQGQWEFST PSVLLSNDSS BFYAHFAAAE NKVAVKS 1437 

Seq ID NO: C320 Protein Sequence
Protein Accession #: NP_005675.1

1 11 21 31 41 51

| | | | | |

MKDIDIGKKY IIPSPGYRSV SBRTSTSQTH RDREDSKPRR TBPLBOQpDAL STAARABGZsB 60 

LDASIi'lHSQIiR XLDEBHPKGK YHEGLSALKP 1RTT6KHQHP VDNAGLFSCH ITPSHLSSIjAR 120 

30 VAHKKGBLSM E:dW6I>8KH]5 SSDVHCRRLB RLf7QBEUIBV GFDAASLRRV VWXFCaCTRLl IBO 

I.SXVCLMITQ UWSFSOPABM VKHLLEXTQA TESNLQYeUi i;VLGI»LDTfi3 VRSWSLAMW 240 

ALNYRTGVRIi RGAILTMAFK KILKI^KNTIKE IC5XJ3EI.IHIC &NnOQRMPEA AAVOSLLAGG 300 

PWAZIiGMIY NVIILGPTQF IiG8AV?I]jFY PAMMFASRI/T AYFSRKCVAA TD&RVQKMNS 360 

VLTYIKFIKM YAWVKAPSQS VQKJRB5SRR ILEKAGYPQQ ITVGVAPIW VIASWTFSV 420 

35 HMTLQFDLTA AQAFTWTVF NSMTPAIiTCVT PPSVKSLeKA SVAVDRPKSL PIMEEVBMIK 480 

KKPASPHIKl EMKHATLAVrD SSH3SIQKSP KLTFKMKKDK RA&RGKKEKV RQLQRTEHQA 540 

VIjAEQKGHIjL LDSDERPSPB EEBGKHIHLO HI^OJOR.'nmS XXfLEIQEGaCb VGICGSVGSG 600 

KTfiLIS7UZJ3 Qfrn'IiIi^SZA I9GTFAYVAQ Q^VWII^NATIiR DHIIiFGKEYD EERYNSVUilS 660 

CdaRPDIiML PS&DZtTSXQE BiGAKLSGGQR QRlSEiARALY SDRSIYIUOD PliBALDAHVG 720 

40 MHIENSAJRK HDKSrrVIiPV THQLQYI.VDC DEVIFMKEGC ITBRSIHEEL MNIiNHDYATI 780 

FZnai^IitiGETP PVEIHSKECBT SQSQKKSCPK GP1CIG8VKKB KKTKPEEGQIj VQLSBKGQGS B40 

VPWSVYGVyi QAAGGFLAFIi VIMALFHI^ GSTAFSTWWXi 5YWJKQG8GN TTVTRONST& 900 

VSDSMKDNPH MQYYASJYAE* SMAVHLILKA XRGWFVHGT LRASSRIiHDE I.FRRILSSPK 960 

KFPDTTPTGR ILNRpeKDMD EVDVRLPPQA KMFIQHVII»V FPCVQHIAGV FPNF&VAVC3P 1020 

45 I^VIIiPSVX^HI V5RVLIRELK RliDNXTQSPF &SHIT3SIQG XJl^rXHAYNRS QEFI.HRYQBL 108Q 

IJ3DNQAPFFL FTCAMRHIiAV RIiDLISIMiX TTXGIMIVLM HQQXPPAYAG LAISYAVC^T 1140 

GliFQFTVRIiA SSTEARFTSV ERINHYnCITi BLSAPARXKN KAESPDMPQE (3SVTFENAEM 1200 

RYR£NLPLVI) KKVSFTIKPIC EKIGXV^RTQ SGK&SLGMAI. FRLVELSGGG Z1CIDGVRXSD 1260 

laLADDRSKL SllPQEPVIiP SGTVHfiKLDP FKQXTEDQIH DALBRTHHKB CXAQLPIiKLB 1320 

50 SEVMENGT^KP SVGERQliLCI ARA&UtHCKI 1.XIJ7EATAAH DTETDUiIQE TIREAFWir 1380 

^TIAHRI^ET VLGSDRXHVIt AQGQWBEDT PSVLXiSNDSS RFYAMFAAAE MXVAVKS 1437 

Seq ID NO: C321 Protein Sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51

| | | | | |

MPALmiOCXZi CPfiLLLPAAR ATJ3RREVC31C HGKSSQCIFD RELaSiQ!TG»6 FRWKJSDSFV 60 

DGZBCEKCKN GPYRHRBBDR CLPC9ICN8KG SLSARCOHSG RCBCKPOVTG AKCDRCLPGF 120 

60 HVdiTIlAQGZTCZ DQRLLDSKCD OJPAaXAGPC 23AGRCVCKPA VTG&RiCDRCR SGYYVUDGGH 180 

PBGCTQCFCY GHSASCR£&A EY8VEKXTST FHODVDGHKA VQRNGSPAKL QWSQRBQDVF 240 

SSAQRIiDFVY FVAPAKPX^ QOVfiYGQSLS FDYRVDRGGR HPSAEIDVZIiB GAiSLRITAPL 30 D 

MPLGKTLP06 I>TKIYTCFBXN EHPSHHRSFQ ZjSYFSYBREJj RNIAALRIRA TYGEYSTGYZ 360 

DHVUiXSAStP VBGKPAFRVE QClCPVGnQS QFCDPCSkSGY KKDSflRIiOPF GTCIPCNOQQ 420 

65 GGACDPDTQD CZSGDEfilPDI ECADCPICPY HDPBDPRSCK PCFCHIIGFSC SVHPETEEW 480 

CEOJCPPGVTG JUlCBW3U)Gy PGDPPGHH6P VRPCQPCQCUr HHVDPSASGH CDRLTGRCLB: S40 
CIHNTAGIYC DQCKftGYPGD PLAEHPADBC RACNOffPMGS BPVGCRSDGT CVCKPGPGGP 600 
HCEHGAFSCP ACYIIQVKXW DQFMQQLQRM SAIiISKAQGG DGWPDTBIiB GRMQQABQAIi 660 
QDILRDAQIS EQASRSIiGLQ LAXVRSQEKS YQSRIjDD1»KM TVBRVRAIiGfi QXQHKVRDllI 720 

70 RI>ITQHQXiSIi AES£ASLGKT HXPASdHWG PNGFKSIAjQB ATBLAESHVB SASNMBQLTR 780 
El^EDYSKQAIk SLVRKAXiHEG VGSGSGSPDG AVVQGI>VEKL EKTKSIAQQL TREATQABIE 840 
ADRSYQHSWl LIiDSVSRLQG VSDQSFOVEE AKRIRQKASS LSTLVTRHMD BFKRTQKNLG 900 
KWKEEAQQLL QNGKSORSKS DQIiLSRfilSIA KSRAQBALSH GNATFYEVSS ILKNLREFDIj 960 
QVDNRKAEAB EAMKRiiSYXS QKV5DASDXT QQASRALOSA AADAQRAIOHG AliEAIiBlSSB 1020 

75 IEQBIGSZiHIj BA19VTASI3AI> AMEaCGLASLK SaSHRBtfBGEI* ERKEIiBFZXEN MDAVQMVXTB 1080 
AQKVDTRAKd AGVTlQpILN TXTOUjHIMD QPLSVDiBSGb VZilEOKIiSRA KTQINSQLBP 1140 
MMSBIiEBRAR QQGUSHI^HUiB TSIDGIUUDV KSIUEHIRnilL PPGCXHTC2AXi EQQ 1193 

Seq ID NO: C322 Protein Sequence
Protein Accession #: NP_066924.1

1 11 21 31 41 51

| | | | | |

MunoLQUis piuiF»3fa:s aivsiawom Kixstotemi vhushkegi. nuecvsQaro fio 
QIQCKVFDSL IiNLSSTLQAT RAUWVGri*!. GVIAlFVATV BtWOtHSOCLED SEVQKHRMAV 120

ItJGATFLTAG LAIIjVATAWY GtlRlVQBFYD PMTPVHARYE FGQAZfTGmA AASliCLiLOQA. 180 

LLCCSCPSUCr TSYPTPaPYP KPAPSSGKDY V 211 

Seq ID NO: C323 Protein Sequence
Protein Accession #: AAM77876

1 11 21 31 41 51 

lU MSSHIHHHGP AMARLWGFCW I.WGFMRAAF ACPTSC3KCSA 8R1HCSDPSP OIVRPPRLEP 60 

MTSVDPBH'ITB IPIANQKRLfi IINEDDVEAY VGLRNLTIVD SGLKPVAHKA FLiCNSiaiLQHX 120 

NPTRHKI»TSI. SRKHPRHIJ5L SELIIiVG«PP TC8CDIMHIK TLQKJVKSSPD TQDLYCUIES 180 

SKNIPLASHiQ IPHCOIfSAN LhAPtiniiTVEB QKSXTEiGCSV AGDFVPKMmt JTVaCTLVSKHH 240 

NBTSHTQGSL RITHISSODS GKQXSCVASK IjVGBDQPSVN IOVHBAPTIT FLBSPTSDHH 300 

15 HCXPFTVIDGN PKPAIdQNFm GAILNSSmT CTKIHVTMET EYHGCLQLDlt PTHHiaHGafyT 360 

IiXAKNEXQKD SKQISAHFHG W90IDDG«HP NYEDVIYBDY GTAAHDIGDT TiaRSHBEFST 420 

rvTDKTGREH LSVYAWVIA SWGFCLZaVM LFIiUCLhSHS KFGMKQFVEiF BKIPLDG 477 

Seq ID NO: C324 Protein Sequence 
Protein Accession #: NP_006171.1

1 11 21 31 41 51

| | | | | |

MSSWIRWHGP AKf\RL.K6?CH I»WOFHRAAF ACPTSOQCSA GRIWC8DPSP OIVAFFRUBP 60 

25 NSVDP£»ITE IFIAHOKSliG II»EDDVX!AY ^VGUOILTIVD 8G][aKFVAHKA FLKNBNXCTI 120 

NFTHHKI/rSlj fiaKKFHHLDIi SELILVGNPF TCSCDIMWIK TLQEAKSSPP TQDIiYCLNSS IBD 

SKWIPLANIiQ IPNCGLPfiAK LAAPNliTVEE GKSITLSCSV AODPVniMYW DVGNI1V8KHM 240 

NETSHTQDSL RITNISSDDS OKQISCVSUBiti ItVOEDQDSVN I>TVHFAPT£T FItSSPlSDHH 300 

WCIPFTVKOTT PKPALQWFYN GAILNESKYI GTlOHVnillT BYSGCLQim PTBMMMQDYT 360 

30 IiIAKHEYGKD EKOISAEFMQ KPGIUDGAHP HYPDVIYEDY GTAAEIDI8DT TSIR£HBIPST 420 

DVTJWCECS«BH IiSVYAVWlA SWQFCliIiVM LFLLKIARUS KF6HKGPASV X8HEDDSASP 480 

UBHIBNGfiMT P8SSEGOFDA VIIGMTKIPV IKHPQYFGIT NSQMCPDTPV QHIKRHNXVIi 540 

KRELGEGAFa K^7FIiA£CY£II> CPEQDKII^VA VKTliKDASPN ARKDFHREAB LLTNI^HBHI 600 

VKFYOVCVBG DPLIMVFHYM KHGDKinCFlTlt AHGPDAVIMA BGNPPTEIiTQ SCSHCHIAQQI 660 

35 AftGMVYLASQ HFVHRDliATR £ICIiVX3BHLIjV KIODFGKSBD VYSTDYYRVG GRTNIfPmHM 720 

PPE9IMYKKF TTESdVHaiiG WXiWBIFTYG BQIPHYCL&iaB EVIBCXTQOR lO^SftPRTCPQ 700 

EVYBLHLGCM QRBBHMRKHI nSZHTUiQIII. AKftSFVYUSX Irf3 822 

Seq ID NO: C325 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KSSWIEWHaP AMARLWGPCH LVTOFWRAAF AXTPTSCKCSA fiRlNGSDPSF OJVAFPKXJSP SO 

45 KSVDFEHITE IFXAHQKRI.B IlN£an>VBAY -VXaUOaurrVD SSLKFVABXA FIiKHSNLQHI 130 

NFTRNKIiTSIi fiRKHFKHLDL 8EI1ZX1VGNPF TC8CDIHHIK TLQIBAKSSPD TOPIiYCUTOS 180 

SKNtPliTOjlLQ IPtHCXSIiPaAH liAAPHLTVSB (3K5ITLSCSV AGDPVFETMYW DVGSILV8KHK 240 

NET9ETQ(?SI* HITHISSDDS GSQISCUABH LVGEDQDSVH LTVHFAPIIT FIiE9FTSZ3HH 300 

WCIPPTVKGN PBCPAIjQWFYH GAILNEBKYI CTKEHVTHHT EYHGCWUMI PTHMHNGDYT 360 

50 I^XAKHHYGED BKQI2AEPMC3 WPQIDDGAKP NYPDVXYEDY aTAAiaDlGDT THRSHBIPST 420 

DV1I>ICIGREH I^SVYAVWrA SWGFCEiIWH I^FIitrKXiKRHS KFGW»FVZiP HKIPLDG 477 

Seq ID NO: C326 Protein Sequence
Protein Accession #: NP_570843.1

1 11 21 31 41 51

| | | | | |

KFUCHYLLLL VGOOMOAOIi AYHGCPSECI CSRASQVECT (32ffiIVAVPIP liFmSAMdLQI 60 

UmiXTBLSSE SPFliNI&AIiI AI^EKHELS RlTPGAFKEnU OSIiBYliSUItf NKLQrVXiPiait 120 

60 FQGUlSLBSIi UjSSHQLIiQI QPAEFSQCSH LlCWTqT.HfflgH IiBYIETOAFD HItVGLTtKLNL 180 

GKnSI/miSP KVFQHLCan:^ VISIjYEHRLIT DIPHSTFDOL VHLQELALQQ NQX^IiSFGI. 240 

FHmgHNLQRTj YltSNKHrSQli PPSIFHQIiPQ L2]Rr«TLF(aTS IrKEIfSU^rFG PMENIr1lELin» 300 

YDHHISaiiED NVFSNIiSQIiQ VLILfiBSIQIS FI8PGAFHGI< TEL»SLSLIir VmSDJJOGHN 360 

FBML2INLaNX BUasasatUBLOSL FOriFjannS LHiaQLQIINQ ISEn^IfiZED ELGKLCBEiRIi 420 

65 YIHFHRCDGD ILPIiRVIWlltUlk HQPRUSIDTV FVCFSPAKVR OQSLLIISW VWrPSVElVPE 480 

VPSYPSTEWY PDTPSVPDTT SVSSTTEIiTS PVEESYIDLTT IQVTDDRfiVW GNTQAQSGnjA 540 

lAAXVIGIVA IiAC9X4VACVG <!CSCCICKR8QA VLMQfMPHB C S81 

Seq ID NO: C327 Protein Sequence
Protein Accession #: NP_002649.1

1 11 21 31 41 51

| | | | | |

MRAIiEJUtLLI. CVLWSDSK0 ^NELHjQVPSN GDCUSaOTCV SKKYPSSimr CHCPKKFGGp €0 

75 HCEIDKGiCrc TBOnGBFYSG KASTDT^IORP CXiPHH&ATVL QQrTYBIUaE&D ALQLGLGKEN 120 

YCRUPDURRR PHCYVOVGLK PIiVOBCKVBD CADCSKKPSSP PBBLKFQOC3Q S!n»llPRFKII 180 

GGEFTTIEHQ PWFAAZYRRH RGGSVTYVCO OSI>I8PC»VI SATHCFIDYP KKEDYIVYLQ 240 

RSRLUSBTQG BMKFEVENIiI UIKDYSADTIf AHBNDIAUjK ZRSKBGRCAQ PSRTXQTXCL 300 

^ PSMYnDPQFG TSCSZT6FGK EIISTDY3>YFB QUOflWXLl SHRESDQQPBY YIGSBVTTKML 360 

80 CAADPQHKTD SCQGDSGOPL VCBIjQGBMTL TOIVSHQRGC ALKDKPGVYT SVfiHFLPffXR 420 

SHTKBEEIGLA L 431 

Seq ID NO: C328 Protein Sequence
Protein Accession #: XP_087254.1

1 11 21 31 41 51

| | | | | |

MQFRECSING MKYQEINGRL VPKGPTPDS8 EQNIiSYIjSSL SHIiMNLSHIjT TSSSPRTSPE 60 

NETELIKEHD LPFKAVSIiCa TVQISHVQTD CTGS)6FNQSN lAPSQIiEYYA S3PDBKAIjVH 120 

A&ARIGIVFI GNSBETMSVK TLGKLSRYKL LHXIiBFDSDR BSMGVIVQAP SQBKIJiPAXG 180 

ABSSILPKCX GGEISKTRIH VDBFAIjKGLA TLClAYRKFT SKEVEBIDKR IFEARTWX} 340 

RBEKLAAVPQ ?ZEKDlilIjIi6 ATAVEDRI.QI) KVRETIBALR MAGIKVHVliT GDKHBTAVSV 30D 

SLSCGHFHRT MNIIjELINQK SDSBCAEOLR QUVHRITSDH VIQHGLWDG TSLSLALREH 3 60 

EK£;FMBVCRN C9AVX<CX31MA. PLQKAKVZRIi XKISFEKPIT LAVGDGANDV GMIQEAHVGI 42D 

OIMGKBGRiQA ABNGDYAIAR FKPK.SKLI.FV HGHFYYIRIA TUVOXPSVIW VCFITPQFIiT 4B0 

QFYCLFSQQr LYDSVTXjTIiY NlCFTfiLPIIi lYSDIiSOEVD FHVLQINKFTIi YSDifiKHRLL 540 

giKTFI»YWTX liGFSHAFXPF PGSYLLIQKD 1SLLGHOQMF IBffHTFQTIjVP TVMVITVTVK 600 

MAIjETHFMTW XWH1>VTWQSI IFYFVF8LFY GGIIiWPFICS QJJMyPVi'IQl. I-SSGSAWPAI 660 

IIMWTCLFIi DIIKKVFDRH tHPTSTEKflQ LTBTNAGIKC LDSHCCFPBG EAACASVGaM 720 

IiBRVI^CSP THISRSWSAS tJPPY^CHDRSZ LTLSTMD5ST C 761 

Seq ID NO: C329 Protein Sequence
Protein Accession #: XP_087461.1

1 11 21 31 41 51

| | | | | |

MLPUiAALlA AflCPl,PPVRG GAADAPQLLG VPSNASVNAS SAASPSPHGC WPRRPPGPPS 60 

ARAKRRKRRR RHLCHISVQR QMLBGLLWKV QRPRGFQCDI* I.LFSTNAHG3 AFFAAAFHRV 120 

GPPLLXEHLG liAAOGAQQDX* RliCVGCGWVR GSRTGRLRPA AAPSAAAAIA GAPTALPAYP 160 

AABFP(»Lnii QQBPLHFCCIi DPSZiEELQGE PGWRLEilRKFl ESTLVACPHT DVIVVH8VAA 240 

ZilWPVPXinS PIiPHGMSQRR TTASTTAATP AAVPAGTTAA AAAAAAAAAA AVTSC3VATK 299 

Seq ID NO: C330 Protein Sequence
Protein Accession #: XP_051522.2

1 11 21 31 41 51

| | | | | |

MDlJttiPDySK P<3IP3DISWP CMSSDCIWD TVMCPHMRSOC SVUbYTItSFI YIFIFVIGMI 60 

ANSVWHVHI QAKTTGX]>TH CYlIiNIiAIAD LWWI^TIPVW WSLVQHHQH FMCSBUTCKVT 120 

HIiIFSinUFQ SIFFLTCMSV DRYIiSlTYFT NTPfi&RKXMV SRWCZIiVini liAFCVSLKVT IBD 

YWCTVTfiAS USBTYCRSFY PEESIKSHLI GHELVSWLG FAVPF8XIAV FSTPCiLABAIS 240 

ASSDQBKSSS BKZIFSYVW FLVCHLPYHV AVZ^IFSIL EYIPFTdOB HAIiFTALHVT 300 

QCIiSLVHCCtf NSVLYSFZHR NYBYSLKKJ^ IFKYSAKTOIi TfOtlDASRVS ETBTSAiaQS 360 

TK 362 

Seq ID NO: C331 Protein Sequence
Protein Accession #: NP_000341.1

1 11 21 31 41 51

| | | | | |

MGFVSQIQUi tmoniTtAKR QKIRFWBLV WPI.6IiPLVI>I WJBXIKSIPU^e SHECHFPNXA 60 

MFGAGMLPIfL QGIFQ3VNHP CF06PTF6ES FGIV^SSTYKUS IIARVYROFQ HLIMHAPE&Q 120 

HLGRIHTEIfi ILSQFMDTLH THPERIAORQ IRIRDII*KDB ETI^IiPLtKBl IGI>SDSVVyii IBO 

lilH&QVHPEQ FAHGVPDIiAI. BEDIACSBAU* SKFIXFSQRR GAKTVRYAIiC SIiSQQTliQHI 240 

EDTZtlONVDF FlOiFBVIlfFTL XASRSQQINIi R&VG6ILSDH SPRIQBFiaR PGHQPIiIiWVT 300 

RPIMEJNOGPB TFTKIMS8IZ.S DliIXXSYPBGG OSRVIi9FNHY SDCnmCAFU? ID5TBXDFIY '360 

SynRRTTdFC HAblQSItESN PLTKZAHRAA KPLLHSKXIiY TPD5PAARR1 LKBIABISTPEE 420 

LEHVRKXiVKA nBEVGPQIWY FFDMST^^WM IRDTLQNFTV RDFESIRQIjiOE BC3Il7AE}iXLfi7 480 

FZrYIO^PSSSQ ADDHAHFDHR DlFNlTinm. RLVNQYIfECIi VEiOKPeSYND ETQZiTQRKLS 540 

LlaBBHKFnUQ WFEDKYPHT SSTiPPRVKYK XRMDUTWBR rtHBOCXDnnn) SGEEttDPVBD fiOO 

F^yrHOaPAY LQDHVBQGIT RSQVQABAFV GiyiiQQKPYP CFVEDSFMII U9ECFPIFHV 660 

IiAKIYSVSWr V3CSIVLfi9CeL RIiKETZJCEIOG VSNAVIKCm FUiSFSIKSK SIFXjZ/TZFIH 720 

EQRIIiHYSDP FILFLFIiIAF STATlMLCFIj LSTFPSKASIr AAACS(5VIYF TLYI.PHILCF 780 

A»aDR»TAEL KKAVSliLSPV AF6FGT&YI>V SFEEQGLGLQ nSHIGHSPTE 60EF8FLLSM B40 

OlOtLLDftACY SLUOIYUSQV FPGDYGTPLP WYFU^QBSYH LSSE6CSTRE EBALEKXBPIi 900 

TBBTEDPBRP BeZHDSFFER SHPOHVPaVC VXKTLVKIFBP OSRPAVDRUI ITFYENQZTA 960 

FI^CaiNGACKT TTEiSlUTQliL PPTSSTVLVG GRDIBTSLDA VRiQei/MICPQ BBIXIiFHHIiT7 1020 

AEBMIiFYAQli HDSKSQEEAQEi EHBAHIiEDTG t^HKRNEEAQ DIiSa(3MQRKL SVAIAF\R3DA ID 80 

KWILDEPTS GVDPYSRRfil MDLLUCyRSG RTIIMPTHHM DBADBQGDRI AZlAQ6R!>yG 1140 

. SGTPZiFUaiC PeXieLYIiTLV RKHENZOSQ^ IQS8B6TC9CS SKBFfiTTGPA HVDP£«TFBQfV 120 D 

XjDGDVSELHD -VVIiJIIIVFBAX ItVECIGQBIrl FLIiFinQlFlCB lUXABIiFBEXi EBZZ1ADL6LS 1260 

SFGXSDTPIiE BXFLKVTED8 PSOPLFASGA QQKRENV29FR EPCLOPREKA GQTPQDSIIVC 1320 

SPGAPAAHPB GQPPPEPBCf GPQLnTGTQL VLOmrQAIiLV KRFQETIE9E KDFI>AQIVLP 1380 

ATFVFLALMZ* 9XVXI.PFS5Y FALTLHPWIY GQQYTFF8MD EPGSEQFTVX. ADVUUIXPGF X440 

GHRCUBCBSHL PEVPGGNSTP nKTPSVSnHX TQLFQKQKHT OVZiPSPSCRC STRBKUEHLP 1500 

BCPBGAGGLP PPQRTQRSTB ILQDZiTDRHX SnFXATKTYPA X>IRS9IiZSKF HVESBQIRYGGI 1560 

Sl^KItPWP ITGBALVGPL SDUORIMUVS GGPITRKVSK BXEDFLKHLB TEDKXKVWFS 1620 

vnOGMHALVSF UTOAHNAIIiR ASLPKDRSPE EYGITVXSQP LNLTKEQXiSB I^TVLTTSVEA 16B0 

WAXCVXFSH SFVPASFVtiY tkXQERVNKGK HLQFISGVSP TTYHVTElFLn DIHHYSVSAG 1740 

IiWGIFIGFQ KtAYTSPEfilli PALVALLIOiT GHAVXPMHXP A8FLFDV7ST AYVAIkSCKNIi 1800 

PIGX2S39AXT FXLBLFI»niR TLIAEMKVLR ICLI1IVFFHFC liGR^IDLAIi 9QKVTDVYAR 1860 

FGBEH8AHPF HHDLXGKNIiF AMWEGWYF X.LTLIiVQRHF FI>SQBTXAEPT KEPXVDEDDD 1920 

VAEEfiQRlIT GGmCTDIIAb HEI/TKIYLGT S8PAVDRLCW QVRPGSCFGI^ JjGVKOAGKTT 1980 

TFKMLTGDTT VTSGDAOrVAG BCSXInTHXSBV HQNNSYCPQF BAIDEU/TGR SHIiYZtYARUl 2040 

GVPAEBXBKtf ANNSTKSLGL TVYADCLAGT Y9SGNKRK£iS TAIMiIGCPP LVZ^BPTTG 2100 

MDPQARSMLIf UVXVSXIKKG RAWUTSESH EECEALClXa* AIKVKGAFRG MGTZQHI1IC8K 2X60 

FGD6YIVTMK IKSPKODLLP DUflPVBQFPQ GNFPGSVQRS SHmHLQFQV BSSBUOUEQ 2220 

IiXiLCTECDSM* XEBY&VTQTT USQfVFVKFAK QQ^rBBBDItPI. EFRAAGAJBRQ AQD 2273 
Seq ID NO: C332 Protein Sequence
Protein Accession #: NP_006662.2

1 11 21 31 41 51

| | | | | |

MVPmilJUU? ItDVCSBBGLlj ILSVLSVIVG CUjGFFLRTR RbSPQBlSYP QFBGKl^UKSOU 60 

UCMMILPLW 3S1MSG1AS1, DAKTSSRLOV liTVAYYLKTT FKAVIVGIFM VSIIHPGSAA 120 

QKErrEQet3K PIMSSADAIiIi DIiIRNHFPAM LVEATFKQYR TKTTPWKSP KVAPKEAPPR 160 

10 RILIY(3VQEE NQSHVQMPAL DLTPPPEWY KSEPGTSDGH MVLGIVITSA TMGIMLGHMG 240 

DSQAPLVSFC QCZITBSVMKZ VAVAVHYFPF GIVFLIAGKI IJEMDDPRAVS K1CLI3FY5VTV 300 

VCGLVXiHGZiP IDPI^IiYPFXT KKHPIVPIKG IliQAIiIiIAljA TSSSSATIiFZ TPBXSAiEISasnSL 360 

IDRHIAHPVI. PVGATIOT^DG TAtYKAVAAl PIAQVNNYEI. DPHQIlTISI TATAASIGAA 420 

GIPQAGIiVTM ViVliTSVGliP TODITLIIAV DWALDRPRTM IHVLGDALAA GIMMICRKD 480 

15 FASDTGTEIOf IiFCETKPVSI. QBrVAAQQEfG CVKSVABAiSE liTLGPTCPHH VFVQVESDES S40 

IiPAaSLNHCT IQISELEIMV 560 






Seq ID NO: C333 Protein Sequence
Protein Accession #: NP_005680.1

1 11 21 31 41 51

| | | | | |

MVTVGNYCEA SGPVGPAnMQ DGLSFCPFFT I»VF8TRMALG TLALVIiaiiPC RRRERPAQAD 60 

SLSWGAGPRI SPYVliQLLIA TLQAai>PIAG IiAGRVOTARG APLPSYIOiLA SVIjESLAiSAC 120 

25 GIAfljiLWERS Q2\RQR]jAHGX MIKFRH6PGL liLLVITVAFAA EHIAIiVStflSiS PQHHWARADL 180 

GQQfVQFSIiWV ULYW^GGLF VLGtiWAPGEfR PQSrrLQVKEC EDQDVERSQV &&2JUQQSTim 240 

DFGRKIiRLLS GYLHPRGSPA IK}IiVVI.t<3£ LMSItERAEiHV {.VPIFntKIV HIiItTEI«PH» 300 

BWWrVTSYV FIiKFliQGGGT GSTGFVSNIiR TFLWIRVQQF TSRRVEIiTF fiaZ^HELSIiRH 360 

HIiGHRTGEVli RIADHGTSSV TGLIiBYLVFN VJPTIiADIII GtlYFSMFPN AWFQLIVFLG 420 

30 HSUXUrUXlW VTEHRTKFRB AHNTQEKATR AaAVI>SLIiNF ETVKYYNA5S YEVBRYRKAI 480 

ZKYOGl^WKS SASLVLIHQI QBLVIGUSLi:. AGSIiIiCAyFV Tfi(^CLGFVGDT VLFGTYIZQEt 540 

VMPUIWFGTY Y3tMIQTHFID MENHFDEiLKB ETEVKDLPGA GPIiRFQKGRZ EFE19VEPSYA €00 

DGRETIiQpVS FTVMPGQTIA nvOPSGAGKS TII>RLIiFRPY DiSfiGCIRID GQDISQPVTQA 660 

SLRSHICJWP QDTVIjPMDTI ADNIRTCGKVT A(3NDBVEAAA QAAGXHDAIH AFPSGYRTQV 720 

35 QB»3LiaiSGG EKQRVAIART lIiKAFGlIliL DEATSAI^S NERAIQA8LA KVCZWRTTlV 780 

VAHRt-SrVVK ADQlIiVIKDG CZVERGRHBA Ui5RGGV»U^ HHOLQQGQBB T6EDT3CPQIM 840 

SR 842 

Seq ID NO: C334 Protein Sequence
Protein Accession #: NP_000667.1

1 11 21 31 41 51

| | | | | |

MI>LBTQDALY VALELVlAAL SVAGHVIiVCA AVGTANTliQT PXNYFLVSIiA AADVAVGLFA SO 

45 IPPAITlSlia FCTDPYGCLF LACFVXjVLTQ SSIFGLLAVA VDRYIiAICVP LRYXSLVrGT 120 

SASGVZAVLH VtAPGlGIiTP FLGmSSKDSA THNCTBFUDG nXIESCCUVK CEiFEBWPHS 180 

YHVYFtlFFGC VIiPFIiLIHLV XYXKZFEiVAC RQLQRTELMD HSRTTIiQRBZ BAAXSLAMTV 240 

GIFALCNI.PV BAVKCVTEiFO PAQGKBlKPKff AKimUIdiSH fiSlSWHPrVy ASEHIIDERYT 300 

FHKIZSRYLL CQADVKSGNG QAGVQPALGV^ GL 333 



Seq ID NO: C335 Protein Sequence
Protein Accession #: NP_443164

1 11 21 31 41 51

| | | | | |

MGI^GABSAHA AIJJ[.GTLQVL AZtI/3AAHESA AHABIIKSHVP SnHTNBTSNS TVKPP TSffAg 60 

DSSUTTVTXU KPTAASMTTT PGHVSlMHrG TTLKSTPKIT SVSQBITSOXS T8TKTVTBBIS 120 

fiVTSAftSSVT ITTYMHSEftK KGSKFXriGSF VOaiVUTIiGV LSILYIGCVM YYSBRGIRYR IBO 

TIDBHDZai 1B9 



Seq ID NO: C336 Protein Sequence
Protein Accession #: NP_004186.1

1 11 21 31 41 51

| | | | | |

HA<^OftMSAF BALCSLAIiXiC AIiSLGQRPTG GPGGGPGRLXi LGTQXDAHGC 1EVETTRCCRD 60 

YPGBBCCSEBr DOICVQPBEH GQ)PGCITCa HHPCPFGQG^T Q8Q6KFSFgF QCZXICASGT7 120 

GGGHSCaCKP WTDCT O P gF Ii TVFPGNKTRII AVGVSGGFPA EPIXSWLTWL lAVAACVXiIiIi 180 

„^ TSAQLGCEXW QLRSQCSWPIl ETQLLLEVPP STEPARSCQF PBBBRGERSA EERSRI/SDLW 240 

70 V 241 

Seq ID NO: C337 Protein Sequence
Protein Accession #: BAC03767.1

1 11 21 31 41 51

| | | | | |

HOmGRVSGL LBRNLQPTI'T YBffSVPFSFGL CIAFLGPTZiIi I]I2K3aTH&&I> PQXSHVFFSQ 60 

QIiCXJiXiGSAL GGVFKRTLAQ SLHALFTSSL AI9I.VFAVIP PCSDVKVtAS VMAZiAGIAMG 120 

CiDTVAHMftL VRMYQKDSAV PLQVUIFFVG PGALLSPIiIA DPFItSEANCL PANSTKNTTS 180 

80 RGHLFRVSRV liOQHHVDAKP HSHQTFPGliT PKZ3GAGIRVS YAFMIKAIiID LPVPMAVIiHIi 240 

LGKEHI1I.TOC PORRPI'I'I^ ISISiALBTQPP BKEDASSIiPP KFOSHLGHED LFSGCORKNI. 300 

RG3VPYSFFAI HITGfkLVlSW TDQEmiVYSA FVYSYAVEKP LSmSHKVAGY LP8LFWGFIT 360 

rGRLLSIPIS SRHfCPATTWF IM7VGWVT F LVLIiIFSVHV VFIiFVGXASEf QDFZiSSTFPS 420 

HLAYTEDSIO YKBCATTVIiV TGASVGStVX. OHLVGSIFQA QG9YSFI.VCG VXFGCIAFTF 480 



YIIililiFPHRK KPGLPSVPTQ DRS16MENSE CXQR 514 

Seq ID NO: C338 Protein Sequence
Protein Accession #: NP_002194.1

1 11 21 31 41 51

| | | | | |

MGPHRTSAAP UPIiIiLVIiM>S QGILNOCZAY HVOliF&AKIF 8GP9SEQFGY AVQQFI2aP!Q3 fiO 

MWLLVG6PHS GPPfiNRNODV YKCFVDXi&TA 7SXFNVIEMK THMSLGI>XLT 120 

lU aMMorGGFrw cwplhaqqcg HQyyrrGVCs DispDFOLeA spspatqpcp slidwwcd ibo 

BSNSIYPWDA VKHFLEKFVQ GtJDIQPTKTQ VGIjIOVANKP RWFNliNTYK TKSEHIVATS 240 

QTSOTOGDIrT HTPOAlOlTAR KYAYSAASGG RRSATKVMW VTDGESHDGS HLKAVIDQCN 300 

HDH1IjRF6£A VLGYLNItHAIt DTKmilKBIK AIASIPTE&Y FFKVSDEAJU^ LEKAGTLGEQ 360 

IFSIEGTVQ6 GDNFgHEK&Q VGF&ADY&SQ NDILMEdGAVG AFGH86TIVQ XTSHQiCbXVP 420 

ID KQAFDQIIOP RNHSSYI^S VAA79T(?BST HFVAGftPRAN irrGQXVX>YSV H&HGHITVXO 46D 

AHHGDQIGSY P(3SVT.CSVDV DKDirTDVLI> VGAPMYHSDL KKEEU3RVYLF TIKKQIDGOH 54 0 

QPIiBGPEGIE KTHPGSAIAA I^DIHMDQFM DVIVGSPLEK QNSGAVYIYN GHQGTIRnff €00 

SQKILGSDGA FRSHLQYFGR SUDGYGDIjISIG DSXTDVgiGA FOQWQLifSg SIADVAZBAS «60 

FTPEKlIIiVll KKAQZZLKLC FSAKFRPTKQ HHOVAIVYNI TUWGFfiSR WSRQLFKEN 720 

ZU ItERCLQKHMV VNQAQSCF!ER ZTYIQEPfiDV VNSZ^DLHVDZ SZiENPGTSPA LEAYSETAKV 780 

FSTPPHKDCG EDGLCISDLV IiIWRQIPAAQ BQPFlVBNaff KRljTFSVTIiTt HKHSSAYNTG 840 

IWDFSEHLF PASFSLPVDQ TBVTCQVAAS QKSVACDVGY PALKREQQVT FTINPDFHIjQ &00 

KliQNQASIiSF QALSBSQHSK TKDSUAniLKJ FI«IiYDABIHL TltSTNIHFYS ISSDGHVPSI 9S0 

VU8F&DVGPK FIFSLKVTTG SVPVfiMATVI IBIPQYTKBK MPXiUyLTGVQ TDKM^ISCH 1020 

Z3 ADINPLKIGQ VGSBVSBKSB HSPRBXKEUSIC RTASCSMVTC HZJCDVHMKGS YFVNVTTRin 1080 

HGTFASSTFQ TVQimUVABI XTmiPBIYVI EDHTVTIPUl IMKPBEKABV FTOVZlGeil 1140 

ASXLI.LI.ALV AZLTOOiGPFK RKXB8»TraiP DEIDETTELS 5 llSl 

Seq ID NO: C339 Protein Sequence
Protein Accession #: NP_113648.1

1 11 21 31 41 51

| | | | | |

MXRPXiARAAP EGBVRGCftVP 5TVI«IiLI<AYl< AYLALGTGVF MTLBORAAQD BeRBFQWBS 60 

- - ELLQftlFTCU) KPALOSLIKD WQfiXKHGAS IiLSHTTSMGR HKLVGSFFFS VSTITrXGSTG 120 

3d BZiSENTMAAR LFCIFFALVQ ZPlMLWUdR LGHLMQQGVH HHASHIiGGTff QDFDKARRUl 180 

GSGAX'I.SGni. I,FL£iZiPPl.IxP SHHB6I99YTH QPYPAFITLS TVGFGDTVXG HHPSQKYPLW 240 

YKHHV&Uill) FGMAHIAItn KXiXL^LElP GQVCSOCHHS SBEDFK&QSH BQGPDREPE3 300 

HSPQQGCYPS GPHGIIQRLS P6AHAAGCGK SS 332 

Seq ID NO: C340 Protein Sequence
Protein Accession #: NP_004145

1 11 21 31 41 51

| | | | | |

43 H KWUMCJJLUUA LOLPFITCVY RBUFKQlAIiP PVYSWVIAAG ItPIjGnCVXTQ iCTSSXtALTR 60 

•EftVYTLNIiAL ADLI^YACSIiP IiItlYHYAQm HWPFGDPACR lA/RPLPYANL EGSII^FI^TCI 120 

SFORYliOICH PIAPWHKRGG RSAHMIiVCTA VWIAVTTQCaj PTAIPAATGI QRHRTVCYDL IBO 

SePALMHYM FyGHALTVIG FLLPPAALIA C?CLIiAC3tLC RQDQFAEFVA QERKSKAARM 240 

AVWAAAFAX SFlf PUXTKT AYLAVRSTFG VFCTVLBAFA AAYK0TRFFA SAHSVLDPII, 300 

DX) FYFTOKKFRR RPHBLLQKI.T AXNQRQ6R 328 

Seq ID NO: C341 Protein Sequence
Protein Accession #: NP_009120.1

1 11 21 31 41 51

| | | | | |

KQHPGEHLWL VU?VM5SCAft ISSMDWERPG DGKCQPIEIP MCKDIGYMKT SHPDILHSHEH 60 

QREAAIQLHB EAFIiVBVQCH GHUlFFIiCSL YAFHCTEQVS TPlPACRVMC EQARIiKCfiPl 120 

MBQEUFKMED SLDCRKIjPMK 1IDFNYI.CHEA EBlElGBDEPTR GSSIiFPPIiPR PQRPHSAQEa 180 

OU J PIiKDaaPOKO <»33HP6KPHa VBXSASCAPZi CrPOVDVYHG RSntRFAWH ZAIHAVI.CFF 240 

SfiAPTVLTFL IDPARFRYPS BPIIFI£MCY CVYSVGJnUIR IiPAOaESlAC DRD8GQLYVI 300 

QEGDSSTGCT LVFLVLYYFG MASSIjWWWL TLTWFIAAGK KHGEEATEAH 5SYFHIJ\AWA 360 

IPAVKTIIiII. VMRSVASDEI. T&VCYVGGMD VMAIiTGFVI*! PLACYI>VXGT 8FXI>9QFVAL 420 

^ FHIRRVMKJG GBDlXDKIvEKL HVRXGXiFSVZ. YTVPATCVZA. CXFYERUflMD YMKILAAOHK 480 

m acromoTKn* dcuoasipa vbxphvkifk uvvGiTsaH vtintxegxuQs woavcsRSLK 540 

KXSHRKPAfiV ITSGGIYKKA OHPOKTHHOK YEIPAQSPTC V 581 

Seq ID NO: C342 Protein Sequence
Protein Accession #: NP_005752.1

1 11 21 31 41 51

| | | | | |

KEVSRRKAFP RPPRPAAPLP IiIAYI^IiAIiftA PGRGADEPVn XSBQAIGAIA A&QSDGVFVA 60 

_^ 9GSCLDQI»Dy SLEHSLSSliY RDQl^GKCTBP VSIAPPARFR PG6SFSKI.I.Ij PYSBQAAGDG 120 

/t> GULLTGHTFD SGAjCBVRPS^ NIiSRKSLPNG TEWGCRPQG STAOWYRAG R£DIIB]7YIAVA 180 

ATYVLPEPET ASRCNPARSD HDTAIALKDT BGRSIATQEL SRLKLCSGAG StHPVDAFLW 240 

MQSIYFPYYP YHYTSGAATG KPSMAHIAQS TEVLFQGQAS LDOGHS^PDG RRLLLSSSliV 300 

EALDVHAGVF SAAAOBOQER R8PTTTAI^ PRM9BIQARA KRVSffDPKTA BSHCKBQ3QP 360 

BRVQPXASfiT XiIHGDLTSVY OrVVMNRIVL FliGTCDGOLL fCVIZWNUrS HCPETIYEIK 420 

oU EBTEVFyX3:>V FDPVXHiyiY ItXAGKBVSSI RVAHCMXHKS C&BCL^TDP HCGHCSSLQR 480 

CTPQGDCVHS EXXhEXmHiie &GAKKCPKIQ IIR8SKEKTT VTMVGSF8PR HSEB7CVKNVD S40 

SSREXiCgmCS QPHRTCTCSI PTRATYKDVB WNVMFSFGS VOTXiSDRPHFT NCS&I.KECPA 600 

CV&iGCAnCK SARRdHPFT ACDP8DYERN QSQCPVAVSK T5G6GRFKEK ICOmTNQALQ 660 

VPYIKSZBPQ nrSXLraCSNV XVTGANFTRA SHinaUOBT taX DK U V lQV BBVLNDTBMK 720 
F^liPSSSKEM KDVCIQFDGG NCSSVBSLeY lALPHCSLIP PATTWISGOQ KITMMQRIIFD 780

VIDNLIISHE UKGNINVSBY CVATYCGFLA PSLKSSKVRT NVTVKLRVQD TYLDOGTLOY 840 

REDPRETGYR VBSEVDHELE VXrQKEHDHP ITISKKDIBIT ItFBQENGQIiH CSPEMITHMQ 900 

DLTTILCKIK QIKTASTIAN &SKKVRVXU3 NLELYVEQES VPSTWYFLIV LPVLLVIVIF 960 

AAyGVTRHKS KEI^SRKQSQQ LCIiXiBS^HK BIRDGFAEHjQ MDKLDWDSF GTVPFUSTKE 1020 

FAL.R-TFFFES GGFTHIFTED MHKHDANDKN ESLTALDAIjI CKKSFI.VTVI HTLEKOKKPS 1080 

VKDRCLFASF LTIALQTKIrV YljTSIIiEVljT RIXLMEQCSNM OPKLMIiRRTE SWEKLLTKW 1140 

MSVCSiSGFIjR ETVG&FFYU^ vmiNQKlKK GPVDVITCKfl. IiYTZJtJEDWIiIj MQVPEPSTVA 1200 

IiNWFEKIPB IQESADVCSSfl 6VNVLDCDTI OQAKBKIFQA FIiSKHQSPYG LQLWEIGLBL X260 

Q)«?TRQKELIi DIDSSSVXIiB DGITKMPPIG HTBIHiaQSTX KVFKKIANFT SDVEYSIIDHC 1320 

KLILPDSEAF QDVQGKRHRG KHKFKVKEMY liTKLI^TKUA IHSVLBKLFR SIWSLPNSRA 1380 

PFAIKYPFDP IiDJVOaBHKKI TDPDWHrHK IHSLPLRPWV NIUCHPQFVP DIKKTPHIDG 1440 

CLSVIAOAPM DAFSLTBaQI* C3KEAFTZIKU» yAKDIPTYKE EVKSYYKAIR DLPPIiSSSBH 1500 

EBFLTQB3KK HENEESii&BVA LTEIYinrrVK 'KFDBIIdUQUB SERiCSLBBAQK QLLErVKVLFD 1560 
BKKKCKWM 



Seq ID NO: C343 Protein Sequence
Protein Accession #: NP_002176.1

1 11 21 31 41 51

| | | | | |

MTILGTTFGW VPSLLOWSG ESGYAQNGDI. EDAELDDYSP SCYSQLEVMG SQEgLTCAtJiS 60 

UPUVSTENIj^ PBIOGAIiVEV KCUlFXtKIiiQE lYFlBIKKFL LIGKSHICVK VGBKSLTCiGC 120 

IDLTTIVKPE APFDLSVIYS. BGAWDFWTF HTfiHLQKKYV ICTMHDVAYR QBKDEHKHTH IBO 

VNIiSSTKLTL LQRKLQPAAM YEllCVRerPD HXPKSFNSBH SPSYY^TFB INETSSGEHDP 24.0 

II1LTISIX.SF PSVALLVIIA CVMJKKRIKP IVHPSLPDHK KTLBHLCKICp RKNIJIVSEBP 300 

ESFZJICQIKR. VK3I0AEDBV EGFliQDTFPQ QLEESEKjQiRIj GGDVQSPNCP SEDWVTPES 360 

FGRD8SI/rCL AGNVGACDAP XIiS&SRSUX: SSSGKN6PHV YQDLIrLSIiGT THSlliPPPFS 420 

IiQSaiLTLNP -VS^GQPXIiTS LGSHQBEAYV TH86FYQNQ 459 



Seq ID NO: C344 Protein Sequence
Protein Accession #: NP_002713.1

1 11 21 31 41 51

| | | | | |

MAAARXfCCiSEi IiUi&ICVALI* WIiI^aAQGA PIiBPV7FQZ3H AIPEQMAQXA ADUmYZNML 60 
roPRYGKRHK EDTUIFSBHG &PHAAVPREI1 SFU>Ia 95 



Seq ID NO: C345 Protein Sequence
Protein Accession #: NP_115534.1

1 11 21 31 41 51

| | | | | |

MTMBHaVRTiIi FTVSIiALQIl NLtairSYOSEK aMGGREBVTK VAT<2XBRQ8F UIHTSSHFGB GO 

VTGSAEGHBP SSPLPYSRAF GBC3ASARPRC CBNGOTCVIiS &FCVCPAHFT GRyOSUDQRR 120 

SEOSALBHSA imiRAi:HI.CR CIFGALHCIiP liQlPDEtGDPK DFIiASHKBfiP SAGGAPSLUti ISO 

LLPCAIaCaRIi LItfiSSAPARPR SI>VP9VI^RB HRPGORPGUS HRL 223 



Seq ID NO: C346 Protein Sequence
Protein Accession #: NP_006524.1

1 11 21 31 41 51

| | | | | |

KMtSUVCU^ XXIiIi&AF&GP GVRSGPHPKIi ADRKLCAI3QB GSHPXBMAVA EdGPYHAEDGR 60 

FUrZHRGQW yVFSKIMR9 BIiFflGGGVQG mTVODLAARL GYFPSSXVRB DQTZrKFGKVD 120 

VKXraOIPFYC Q 3.31 



Seq ID NO: C347 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

KTQVTBKSXB HPBKITSTTB XTTRTPEKPT IiYSElCFICfK GKNTPVPGOOP TEHLQ9TTI>T £0 

TBTIKAEVKS TBHPBRIAAV TKTXKPSVKV TGDKBLTTTS 9EUIKfl?BVTH QVPTGSFTLI 120 

TSRTKZiS&IT 8BATGNBSBP YXiNXDGSQKlG IBAGQHGEHD SFPANAIVXV VLVAVTLZXiV 180 

FLGLIFLVSY MHRTKRTI>TQ IflTQynDAEDB GGEKSYPVYI^ HEQQKLGMGQ IFSFR 235 



Seq ID NO: C348 Protein Sequence
Protein Accession #: NP_543146.1

1 11 21 31 41 51

| | | | | |

MTQVTBKSrB HPBKTTSTTE KTTRTPBKPT CiYSBKTXCrFK 6KMTFVPBKP tTEKUSHTTI^ 60 
TBTIKAPVKS TBNPEKTAAV TKTI3CP3VKV TfflOKSI>TTT9 SHLBIKTBVTH QVPTOSPTTLI 120 
TSRTKLSSrX BEATQiTESHP YUftKDSSQKG XEAaQ!^4GEND SFPAHAIVrV VLVAVItLLV IBO 
FliGLIFLVSY HMRTRRTIiTQ lilTQYnDAEDB GGPXnSYPVYX> HEQQHbGMOQ XPSPR 235 



Seq ID NO: C349 Protein Sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MWPRIAFCCW GLAIATSOWAT FQQHSPSSHF SFRIiFPETAP GAPGSIPAPP AP<3DSAAG3R 60

VERLOQAFRR RVRLIJUSLSE RIjELVFLVOD SSSVGEVNFR SBLMFVSKLIj SDFPWPTAT 120 

KVAIVTFS8K NYWPRVDYI &TRKARQHKC ALIiLQHIPAX SYRQGOTYTK GAPQQAAQIL 180 

LHAKEHSTKV VFLITDGYSN GGDPRPXAAS LRDSGVEIPT FQIWQGNIRE LNDMASTPKB 240 

5 EHCYLIHBFB BFEiO^ARRAL HEDLPSGSFX QDDMVHCSYL CDE6KDCCDR MGSCKCGTHT 300 

□aFEClCBKG YYaKSLOVEC TACFSGTirKP EGSPGGISSC IPCPDfiHHTS PPGSTSPEDC 360 

VCREGYRAfiG QTCELVHCPA UCPPISNGYFI QNTCNNHFNR AGGVRCHPQF DI/VGS&IILC 420 

LPNGLWSGSE SYCRVRTCPH -LRQPKHGHIS CSTSEMLYKT TCLVACDESr RLEGSDKLTC 480 

QGHSQWDGPB PRCVERHCSr FQMPKDVIIS PHKCQRQPAK FGTICYVfiCai QGPILSGVKE 540 

10 MLRCTTSGKH NVGVQAAVCK 2JVEAPQINCF KDIB&KTLEQ QDSANVTHQl PTAKX&I5GES. 600 

VSVHVHSftFT PPYIiFPlGDV AlvyTATDLS GNQA8CIFEI KVXEABPPVI DHCRSPPFVQ £60 

VSBKVHAAeW DBPQF8DHSG AELVITRSHT QGDLFPQGET IVQYTATDP9 GNNRTCDIHI 720 

VIWSSPCEIP FTPTOGDFIC TPntJTGVMCT LTCLB6YDBT EGSTDKYYC31 YEaDGVMKPTY 780 

TTEWPDCAKK RFANHGPKSP 3BMFYKAARCD DTDLMKKFSfi AFETTLGKMV PSPCSDAEDI 840 

15 DCRIiEENLTK KYCLEYHYDY ENGFAIGPGG W6AAHRLDY8 ySDFX^DTVQE TATfilGMAKS 900 

5R1SR&APL8 DYKIKLIFNI TASVPI.FDER MDTZiBWBNQQ RLLQTIiKTIT NKIiKRTSjHKD 960 

fHYSFOIASE I&ZADSNSZiE TKKASPFCRP GSVIiRGRKCV HCPLOTYyill. EQFIKSSCRZ 1020 

GSYWEEGQL ECKLCPSGMY TSYIHSHHIS DCKAQCKQQT YSYSGLETCB SCPLGXYQPK 1080 

FGSRSCL9CP EMTSTVKaGA VNISACGVPC PEGKFSRSGI» HPCHPCPRDY YQPKAGKAFC 1X40 

20 EACPFZGTTP FAGSRSITEC STSVIiKIITXF GGPGHIiELUT CPSEVFHEICF FNPCHN8GTC 1200 

QQLGiRGyVGE. CPUSFjrTGbKC EIDIDEC&PI* PCLNIilGVCKD LVGBFICBCF SGYTGORCBB 12fiO 

NI»EC&SSPC LISKGICVDGV AGYRCTCVKO FVGLHCBTEV 17BCQ8lffPCLH HAVCBDQVOG 1330 

FLCKCP1?GF]:« GTRGGKNVDE {:!I^QPCICI]GA TCKDGAN9FR CLCAAGFTGS HCELNIHECQ 1380 

SMPOIHQATC VDBLNSYSCK OQPGPSOKRC ETEQSTOFNIj DFEVSGIYGY "VMLDGMLPSL 1440 

25 HAUTCrrFWMK S5DDMHYGIF ISYAVBblGSD NTLLLTDYNG WVX.YVNGREK llZiCPGVHDG 1500 

RttaBiATmr ejuaGiKKvyz dgklsdggag i.gva£iPiFGs galvlgqbod kkgeofspab isco 

SFVOSTSQW LHDYVI.SPQQ VKBIiAT8GP£ ELSSOGSVUM PDFIjSGXVGK VKXDSKSIFC 1620 

SDCPRLGGSV PHtRTASEDI. KPGSKVHIiFC DPGFQiLVQIP VQYCLNQGQW TQPLPHCERl 1680 

SCXJVPPPIiHN QFHSADDFYA GSTVTYQCWN GYYX.I*QD^RM FCTDSGSWNG VePSCUJVDE 1740 

30 CAVG8DCSEH ASCIVHVDGSY ICSCVFFYTG DGKNCAEPIK CKAPGtiPEHG RSSGEXYTVG 1600 

AjOVrFSCQBG ICQLIIGVTKIT CLE&GEBIMHIi XPYC3CAVSCX3 KSAIPSNGCI SBLAFTFGSK IBfiD 

VTYRCNXGYT rAGDKE9SCI> AH33H&H&PP VCSFVKC88P ENXHNGKYZI. SGLTYtiSTAfi 1920 

YSCDTOYSliQ QPSIIECTAfi GIWDRAPPAC HLVFCGEPPA IKDAVITQMN FTFRNTVTYT 1980 

CKBGYTLAGL DTIECLADGK KSRSDQQCLA VSCDEPPIVD HASPETAHRIi FGEIXAFYYCS 2040 

35 DGYSLRDHSQ IiXiC»AjQGKMV PPEGQDMPRC IAHFCEKPP9 VSYdlljBSVS KAKFAAGSW 2100 

SFfQCMBGFVL MTSAKZBCHR GGQjHHPSFMS IQCXPVRCGE PPSZMHGTA3 GSHYSFOAKV 2160 

AYSOIKGPyX KGEKKSTCEA TGQWSSPIPT GHPVSCGBPP KVEta^tiEHT !EGRIFESKVR 2220 

YQCNPGYK6V 68PVFVCQAN RHHH3ESPLM CVPLDCGKPP PIQNGFMKGE SIFEVGSKVQF 2260 

FCHBGYBLVG DSSMTCQKBG KHNKKSNPKC MPAKCPEPPI. DENQLVIJCEL TTEVGWTFS 2340 

40 CKE6EIVLQ6P SVLKCLPSQQ WHDgPPVCKX VliCTPPPLIS F6VPIPSSAL HFGSTVKYfiC 2400 

VGQFFUU^ TTLCQPDGTH 86PLPSGVPV BGFQPBBIPtT OIZDVQailiAY LSTAIiYTCSCP 2460 

GFEtiVGISTTT LGGBSGHMI^ CSEOPTCKAIEC IiKPKBII^IGK FSYTDLKYGQ TVTYSaSBGF 2520 

mSGfPSMJTC IiBTGDWDVZ3A P&CMAIHCDS PQPIENGFVE: OADYSYGAXI IYSCFFGFQV 2S80 

AC31AMQTCBB fiGHSfiSIPTC MPIDOGI^PH XBFISICTKLK tKDQGYFBQED DMMEVPYVTP 2640 

45 ^PYHLGAVA KTWEKTKBSP AlHSSNFliYG TMV8YTCNP6 YELLGNFVI*! OQEDOTHNOS 2700 

AP5CZ&IBCD IiPTAPEHGFIi RFTBT8HGSA VOYSCKFORJ XAOSDIAbCZi SHRKHSC&SP 2760 

RCSAISCRKP VSVKSKSaiKa SHYTYDSTLY YECDPGYVLN OTEK|tT0QDD KHnDBDEFJC 2820 

IPVDCG&PFV 8AMGQfVH<roB YTFQKBIEYT CSIE0FLLEGA ReaVCLAHCS WSGATPDCVP 2880 

VRCATPPQLA KGVTEGLDYQ FMKBVTFHCH BGYILH6APK LTCQSD^TOD AEIPLGKPVK 2340 

50 OGPPSDLASG PEHGFSPIHG GEIQYQCFPG YKUBXmSSBB. CLSUGfiWSGS 8PSCLFCRCS 3000 

TPVXEY(?rVN OTDSUCGEKAA RIQCFKGFKIj liSL&BITCBA JOGOKSSGFTE CBHTSGSSXiP 3060 

HI»IAFI8BT SSHKBSIVITY 8CK9GYVXQG SSEHilCSTBlOS VHSQPYPVCB PL80GSPPSV 3120 

JOffAVATGEAH TYESEVKL&C LSGYTMDTDT DTFTOQKDGR KFFERISCSP KKCPIiPBEIIT 3180 

HILVHGDDFS VKRQVSVSCA ESYTFEGVNI SVOQLDGTWB PPFSDESCSP VSOSKPBSPE 3240 

55 HGFVVSSKYT FESTXZYQCB P6YBLEGNRB RVOQEHRQHS GGVAZCKBTR CBTPLEFLSIG 3300 

KKDIENRTTG FNVVY8CNB0 YSLEGPSBAH CSBHGTNSHP VPLCKPHPCP VPFVIPENAI. 3260 
LSBKBF1CVDQ NVSIKCHEGF Iil^GBGIITC NFDBTKTQTS AKCE9CXSOGP PAHVBHAlAR 3420 
GVHYQYCaSMI TYSCYSGYKL EGFXASVCLB NGTttX&PPXC SAVCfiFPOQH GGIOQRPmC 34 SO 
SCPEGHKGRXr CEEPXCIIiPC LlilGGRCVAFY QCDCPPONTGI SRCEiZAVODS FCUIGGKCVR 3540 
OO PSIRCECLSSH TGHI7CSR 3557 

Seq ID NO: C350 Protein Sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MRF8V8GMRT DYPRSVIAPA YVSVdiliZiLC PREVIAPAGS EPMLGQPAPR GSnaYigPCB 60 
QCXJYHDAIVS LSETBQOGPP CIFBIPCFELC CLDSFOiTHD FWJOjKVQCIV KSQCH&SPIS 120 
SKCERGRXC 129 

Seq ID NO: C351 Protein Sequence
Protein Accession #: AAB35671.1

1 11 21 31 41 51

| | | | | |

MVPOAItGGGA XiARAA6EtGIiX> AUjIAVSAPXi RXdQAEBIiGDG CXSHLVTYQDS 6THTSKHYFG 60 
TYPEIHTVCEK TXTVPKGKKli IIiBliGDLDIB SQTCASDYLIt FTSSSDQYGP YCGSMTVPKB 120 
LLLHTSBVTV EUPBSGSHISG RGFliXiTYASS DHFDI*XTCLfi RABHYLKTEY 8KFCPAGC3U} 180 
VAGDI90NMV DGYRDTSLEjC XAAIHAGXIA DEI.OGQX8VX* QEiKGX3RySQ XXJ^nCVLSRD 240 
OU G&L8DKRFLF T6NGCSRSI*S FHPDC3QIRA9 SSWQ9VIfSSG DQVHHSPGQA HE^QPOGPSHA 300 
SGDSSHHHKP SEHLSXDLGE KKKITGIRTT GSTQSHFNFY VKSFVMHFKN NbTSKHKTYKG 360 
IVNBBBfiCVFQ Gti^SNFRDFVQ KbTFrFPIVAR YVRWPQTHH QRIAIiKVBLI GCQITQGNDS 420 
IiVWRKTSQST SVSTKKSDET ITRP1P8EBT 8TGXNITTVA XFLVIiXiWIfV FAGH6IFAAF 480 
HKKKKKGSPy GSABAQKTDC HKQIKYPFAR HQSABFTI&Y IINEKEnTQKI» DZilTSnHAG 539 

Seq ID NO: C352 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

^TGFQAGaSIJR PVPAPR5SAE EAARPGQIiKL BIRRGEABLA KLAP5GVMVP GARGGGftltAR 60 

AAGRGLIALL LAVSAPLRLQ AEELGDC5CGH LVTYQDSGTM TSKNYPGTYP KHTVCEaCTIT 120 

10 VPKGKRLILR LGDIJDIBSQT CASDYl^LPTS 8SDQYQPY0G SMTVPKEUi MTSEVTVHFE 180 

SQSHISGRGF lA/EYASGDRP DI»ITCI2RAS mfLKTEySKF CFAQCHDVAG DISGHHVDGnr 240 

RDT8LLCKAA ZHMSIISDEZi GQQISVUQRK. GISKYBSIIA. 1SIGVLSRD133U SDKBFI*'PTEK 300 

GCSR5I»SFEP DOQISA6SSW QSVEIBStTDQV HHSPGOASLQ DQGPSHASCSD fieHNHKPRBH 360 

IiSrDliGEKKK ITGIRTTGST QSNFNFYVKS FVMNFKNMNG KWKTYKGrVW WEEKVFQGMfi 420 

15 NFBDPVQHHF IPPIVARYVR WFQTHHQRI A£>KVSLIGGQ ITQ6HDSI.VH KKI6QSTSVS 480 

TKKEDBTITR PXP3EETSTB ZNITWAIPL VIiCWLVFaiS 1QGZFA21FSKK KKBBSFYG&A 540 

SAQKTCCHKSQ ZK^EAttHQB AEFTISTOHB KBHTQKUSLI TSTOW} 586 

Seq ID NO: C353 Protein Sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MFQHQBaPIjD IjSSAEAVAAW IliHCJHPDIIN KGDGOGHLVT YODSGTMTSK WVPQTYPNHT 60 

25 VCEKTITVPK GKRLILRLGD LDIBSOTCAS DYt^LFTSSSD OYGNQKEEETT EVLCLSVAQA. 120 

QRVDIPVQLI. PSFEiBGHH^ ADAOGPYCGS HTVPKBUjIM TSEVTVRFES QSai.&QBaSPL, 100 

LTTTVSSDHSD IrITCIiERA8B YLKTEITSKFC PAGCRDVftGD XSGHHVDGYR CTSLI^CiaJlI 240 

BA.<3IIADSL6 GQISVLQRKG ISAYBGJhPN ^VtiSRDGSLS DKRFLFTSMa CSRSL&FBPD 300 

GQXRASSSHQ &VI!)ESGDQVH HSPQQA&liQD QGPSWASGDS 8NIIHKPK£t9L EIDIiGEKKKI 360 

30 1X3IRTTX38TQ SHEnFYVKSF VMNFKUNNSK HKTYKaiVHef EEKVFQGn^ FRDPVQNliFI 420 

PPZVASyVRV VPQTHHQXIIA UCVELIGCQI TQGHDSLVIIR KTSC2ST&V8T KKEDBIMIRP 4B0 

IPSBET8TPA MFVQIVGDHT QKISQRBHIiQ PDBSSZFFKG TAESHVRWF AWVHDLGHIi 540 

FIAHTPEEDI DHYCHKQIKY PFARBQSABF TX8!eDNEKEN TQKIAE.IT&D MADYQaPLMI 600 

GTGTVTRKGS TFRPMDTDRE EftGVSTDAGG HYDCPQMflR HEYALPIAPP EPEYATPIVE 660 

35 BKULIUUrrFS AQ&GYRVPGP QPGHKH9IjS& GGFSPVASVG AQDaDVQRPH SAQPADRGYD 720 

X»KAV6ALAT BSGHSDSQKF FTHEGTBDSY SAPSDCSiTPXi MQTAMTALI] 769 

Seq ID NO: C354 Protein Sequence
Protein Accession #: NP_004607.1

1 11 21 31 41 51

| | | | | |

HACVSACXKT 5HFTF1?FLFW liGGIIiIIiAIA IWVRVfilsIDSQ AIFGSEDVQ5 SSYVAVDIIiI 60 

AVOAlIHIIiG FLGCXX».ZKE SRCMLIiIiFFI GILLLIUiIiQV ATGIDGAVFK SKSDRIVHEl? 120 

45 IiYSNTECLIiSA TGESHHQFQEI AIIVFQEEFK COQLVHGAfiD WGHMFQHYPS liCACUTKQRP 180 

CQSXNtaKlQvy KBTCX8FIXD FIAXNXiIIVI GISCGLAVIE II^VFGHVL YOQIGISK 237 



Seq ID NO: C355 Protein Sequence
Protein Accession #: NP_004608.1

1 11 21 31 41 51

| | | | | |

MCIGGCARCL OGTLIFIAFF GFLANIIiIiFF PGGKVinDND HL9QBmFFO 6ILGSGVXMI 60

PPALVFLGLK NNDCCGCTON EGCGKRFAMF TSTIPAWQF LGASYSFXIS AISIHKGPJCC 120

UnUSTHOYF FBDGDYIiNDE AlMHKCRBPIi nwpnNLTLF STwrvaaiQ mvlczuqwn 180

GUiGTLGGDC QOCXSOCOiaDG PV 202






Seq ID NO: C356 Protein Sequence
Protein Accession #: NP_002372.1

1 11 21 31 41 51

| | | | | |

MPRPAPAHRL FGLLDIJMPZi IiUiPGAAPDP VARPGFRRLE TBGPGGSPGR RPSPAAP0GA. 60 

^ PASGTSEPGR ARGAGVCKSR PIiDLWIIDS SRSVRPLEFT KVKTFVSRl I DTLDIGPADT 120 

65 RVAWSYA6T VKXEFQIKIAY -CDKQSIiKQAV ORITPLSIGT M5GLAIQXAH DE2UrrVBAGft 180 

. SSP83HZPRV AIIVTDGRPQ CGfVBIEVAARA OASGIBLYAV GVDRiUMftSL mWSBBUBB 240 

EVFYVETYGV IBKLSSRFQB TFCAU>PCVXi GIEQCQjHVCI BDGBQKHiXCB CSQ@YTLI!IAD 30 D 

IOCTCSMjDRC ACNTHGCEHX CVUDRSQSYH CBCyEGYTUV EDSKTCSAQD KCALQTHGCQ 360 

HICVMDRTGS HHCECXEGYT liEUMOCTCSV ISDKCAliCEHG OQHICVgDOA. ASYHCDClfPO 420 

70 YTLEIEDKKrC SAXBBARRIrV STEDAGGCRA. TiAFQDKySS YXiQRUn^KLD DILEKUCINB 480 

TGQIHR 486 



Seq ID NO: C357 Protein Sequence
Protein Accession #: NP_057723.1

1 11 21 31 41 51

| | | | | |

MARGSLHHLL RIAVUGLWIA EUSVAOBQA PG1APC6RG8 SHSftDLXJKCM DCAfiCRARPH 60 

SDFCIiGCAAA PPAPFBI.IMP IIKaSAIiSLTP VLGLIiSGFliV ITRRCSBREKF TTPIEETG6E 120 

80 GCPAVALZQ 125 



Seq ID NO: C358 Protein Sequence
Protein Accession #: NP_001810.1

1 11 21 31 41 51

| | | | | |

MQPTLLLSLL GAVGIAAVNS MPVE9SIRNHNE GMVTRCIIEV LSNALSKSSA PPITPEC31QV 60 

IiKTSRKDVKD KETTENENTK FSVRliDRDPA DASEAHESSS RGKftGAPGEE DIQGPTKftDT X20 

5 EKHABGGGfiS RSRADSPQHS LYPSOSGfVSE BVKTRESEKS QREDEEBEEG EEIYQKGEROE 180 

DSSEBKBIiEE PCGTOHAFEdZ ERKOZ^XKK EELVARSETB AACSISQBKTH SRSKSSQESG 240 

SBAG5QBNEF QESKGQPRSQ EESEEGBEDA TSEVDKRRTR PfOmilGRSRP 300 

SSBSCGHPQEB 6EBSNVSMAS U3EKRDHHST HYRABEEEPE YGEEUKOYFG VQAPEDI£HE 360 

^ RyRGRGSEEY RAPRPQSEES WDEEDKRNYP SLELDKMAHQ YQEBSEEERG l^EPQKGHHHR 420 

10 GRGGBPRAYF MSDTREBKRF LGBGHHKVQB NQHDKARSHP Yl^GBBGAP 480 

GKffQQQGDLQ DTKEHREEZUl FODKOYSSMH TAEK&KRICT ZiFHFYYDPLQ KKSSHFEKSD S40 

NMBIEHFIiBSE EEN£XiTLI9EK nFFPEYNyim HBKXPFaEDV imGYSKRHIA RVPKIiDIjKRQ 600 

YDBVAQUX2L LHYKKKSAEF FDFVDSEBPV 8IEQEAEHEK DRADQTVLTE: DEKKELEEXLA 660 

AHDIiSLgKIA EKF&Q'liG 677 






Seq ID NO: C359 Protein Sequence
Protein Accession #: XP_093082.1

1 11 21 31 41 51

| | | | | |

VBSULCBGUSQ PNCVLQTIiRW YRCCrlSSASC! GALhAVLGTS QHLTEIiBFSE TKLEA&ALKL 60 

I.YGGLKDP»C KLQKLSTLQFS IfBVTAAXLPV CHVGirCSQFB QSLVQSHFGY CQDSSFKCDIi 120 

CKIiLftPSTRV AAAKDCG9PK SFLSBGLHWA 6RLBAVBEVX* GLGVI.VQP<3D PA6QGGSHCE 180 

NYGSFRDL.VD liEVKAHPSLK KGC3MDLQRPT LQVUILCKZF Sl^KLFIiPIAIi EMgPQQVSW 240 

25 QVTIPD6FVN VTVOSNVTl^l CIYTrxVASR BQLSIQttSFF RKKENBFIfiS PREEGKWPI7V 300 

EAVJQQTU^QQ QAELQIY7SQ GGQKVATGQF XDRITQSHDP GHASITIBBH QPADSOIYIC 360 

CiVElNPPDFLG QSigGII^filVSV XiVXPSKPLCS VQ0RPBIGET ISLSCLSAliG TPSFVYYNHK 420 

LEQHDIVFVK SMFNPTTGIIi VIGHIJTHFEa QYYQCTAlNR LtaiSSCEIDL TSSHPEVGII 480 

VGALIG3LV5 AAIIJSWC3? ASMKAKAKAK ERNSKTIAEL fePHTKIOTRG ESEAMPREEA S40 

30 TQiliBVTLPSS IHET6PDTIQ BPDYEFKFIQ EPAPEPAPGS EPM2VVp[lLDI ELELEPETQS 600 

ELBFBPEP^ BSBFGVWEP IiSSDEKGWK A 631 



Seq ID NO: C360 Protein Sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

HVFAPWKVFI. 1LSCIAGQV3 WOVTIPDCSP VKVTVGSHVT LICIYTTTVA SHBQI.SIQWS 60 

PFHKKEMEPI SSPWEEBKMP DVBAVKGTLD GQQAELQIYF 9QOTQAVAK5 QFKDHTTGSW 120 

40 DPCaSASXTlS RMQPADSGIY ICDVHMPPDP UaQHaOlUiV SVLVKPSKPIi CSVQGRPEirG 180 

RrXSLSCXtSA LGTPSPVYYW HKUBGRDIVP VKEHSISPTTG IIiVJOHUrHP BOGVYQCTAI 240 

VtOjXSSISSCBX DUrSSHPEVG irVGALXOaL VOAailZSW CFARHKMOUC AXSSNSKTlA 300 

BLSFMTKIHP RGBSEAKPRE WSQLE^hV &8IHBT6PDT TQEPDYEPKP TQEPAFEPAP 360 

GSBFNKVPDIi DZBLKE£PBT QSELBPEP8P EPBSSPGVW EPiLBEDERGV VKA A13 



Seq ID NO: C361 Protein Sequence
Protein Accession #: NP_003011.1

1 11 21 31 41 51

| | | | | |

HVSBHVerTMIi eSLLFWZASG HTPAFAYSPS TPDRVSEfiDI QEtLUUfSVKEQ IiGIAHPRVSY 60 

PABQAMiaiATa PQ31EGGAHB 'GLQHLGFFGII IFNIVABUXO DNIPKDFSED Q GYBDP PHPC 120 

FV6RXDDGGL SNTPDXZffiFS REFQIiHQKXf BPERDYPGLG KNHKXEiIiYSK KKGGrERHKKR IBO 

SVHPYLOGOR UMWAKKSV E!EFSDBDKDP B 211 



Seq ID NO: C362 Protein Sequence
Protein Accession #: NP_076926.2

1 11 21 31 41 51

| | | | | |

MtrTHOGHEQA VBW^BGVVQ LGNMAVIRSH IMKGLQEBF& KOBFKVliGW QILTAIJtSIiS 60 

KQITMHCMftB NTXGfiNFISV YXOYTmOSV MFIZSGSL&I AAGIRTTKQtL VSJOSUtSmZT 120 

SSVZAASGZZ> INTPgiAFYS TOIPYCIIYYG llSEIIICECGTf4S ILHSIjDGHVL LIiSVIiEFCZA 180 

VfiL8ASGCK7 LCCTPGGWXi IIiPSESDOMAB lASFTPUnEV 320 



Seq ID NO: C363 Protein Sequence
Protein Accession #: NP_002082.1

1 11 21 31 41 51

| | | | | |

HRGSBIiPIiVIi LAIiTItCLAFR ORAVFLnuaG GTVLIKMYFR GNHNAVGBLM CXKBUGBSSS GO 
VSEHGaUDQQ LRBYIRHBBA ABNLLGLXBA KBHRHHOPPO PKALGHQQPS VD3EDQSSFK 120 
DVGfiKSKVGR IjSAPG9Q&EG StOPOLHOQ 148 

Seq ID NO: C364 Protein Sequence
Protein Accession #: HP_p3e393.1

1 11 21 31 41 51 

I I I I I I

MDLQGROVPS ZDRUZVLLML FRTMAQIMAE QEVEIiIiSOI>5 ^EHPEKDIFW RHHOTTCLHA 60

BFAAKFIVPy DVKA5NYVDi;> ITBQADIAI.T RGAEVKGROG SSQSBLQVfn VDRAYAUCMC. 120 

FVKESHNK9K 6FEAIHRLSIC VQFVYDS8SK THFKDAVSAQ XBTtVHSESlUS AIiVTFAOKSY IBO 

BGQAQQTISL A8SDPQKTVT MZi*SAVHIQP FDII8DFVFS SEHKCPVDBIft EQI.E3STLPL1 240 

LOLIIiGEiVlM VTLAIYHVHa KHTANQVQIP SDRSOfYKHMG 280

Seq ID NO: C365 Protein Sequence
Protein Accession #: lilp_0032i7 . 1

1 11 21 31 41 51

I I I I I I

MLQliVXiAI^XiS S&SAESYVGL SASSQCAVPAK DHVOCGYPHV TFECEC3IHRGC CFDSRIPGVP €0 

WCPKPLTRKT BCTP 74 

Seq ID NO: C366 Protein Sequence
Protein Accession #: NP_002984.1

1 11 21 31 41 51 

MSIiPSSRAAR VPGPSQSIiCA I>IJaJdIiLLTP PGPIASftGPV SKVIiTEEACT CLRVTUmiP 60

KTIQKLQVFP AGPQCBKVBV VASLKNGKQV CXiDPBAPFLK KVIQKIU^SO HKKH 114 

Seq ID NO: C367 Protein Sequence
Protein Accession #: NP_005233.2

1 11 21 31 41 51

I I I I I I

MRSPSAAMLIi GAAIIiIAASIi SCSQTIQGIN KSSKGRSLXG KVDQTSmrTG KGVTVETVF3 60 

mnSFSASVUT GKLTTVFIjPX VyrZVFWGt. PSHGHAliHVP LFRTXKKHPA VraraNUaA 120 

DLItBVIMFPL KIAYHZHANN niVCSEALCNV LXCFFYGHHy C9XLF»rCLS VQRYWIVNP 190

MOHgRKKANI AIGIBIAIWL LILLVTIPLY WKQTIFIPA LNITTCHDVL PBQIiVGDMF 240 

NYFLSLAIGV FLFPAFLTAS AYVLMIRMLR SSAMDENSBK KRKHAIKI^IV TVX«AMYI.rCF 300 

TFSHULLWH YFLIK8QGQS HVYAIiYJVAI. CbSTLHSClD PFVYYPVSHD FRDHAKHALD 360 

GRSVRTVnOH QVSI/CfilEKHS RXSS8Y8S88 TTVKTSY 397 

Seq ID NO: C368 Protein Sequence
Protein Accession #: HP_00346o.l

1 11 21 31 41 51

I I I I I I

MAEAKXUHDG AAL8LIPLIF I>I5GABAASF QIZNOULQK&P DLBliEISIVrQKF P6P&MIEtM.E 60 

YIENLRQQA3 KEE5SPDTHP YQGVfiVPLQQ KESTOXBSHIiP ERDSLSEEXm MRIIIiEAUSQ 120 

AEHEPQSAPK BHKPYALNSB KtTFPHnHSDD YBTQQHPBRK IiKEHQFPPKY EENSRXlUPFK 180 

SXHE^IVSBQY TPQ8ZiATI.ES VFQSLGKLTO FNNQKSHSMD EBQKLYTDDB DDiyKAHKIA 240 

YEDWOGEDH SIPVBSKIESQ TQBEVKD8KB mOKHEQmD EMKRSGQLOl QEBDIiEaCEGK 300

CQLSDDVSBV JAYIiKRLVBIA AGSGRLQEXS^^ lIGERATRItPB KPLDSQSIYQ LIEISRHLQI 360 

PPQDLrEMIrK TGEaCPUGSVE PSRELQliPVP laDDISBAlTIiD HPDLFQNRMI> 9KSGYPKTP6 420 

SAGTEAZ.PDG [iSVEDIUIIilj OMSSAANQKT SYFPSPYXUQB KVI>&SIJ>YGA GRSRS2«QIjPK 4^0 

ASHXPBVENR QMAYEBIUtDK SQBLGEVLAR I4LVKYPBIZN fiHQVKSVEGQ QSraSDLQBE 540 

BQIBO&IXEH XjNQGSSQETD KLAPVSKSFP VGPPKMUDrP HBOYWDBDCIi HKV1.BYU9QB 800

KAEKGRESXA ICRAHEISM 617 

Seq ID NO: C369 Protein Sequence
Protein Accession #: NP_112217.1

1 11 21 31 41 51

I I I I I I

MPCAOSSWIA nLSWAQLUff PQRIiCirGRQP QP6PVBFFDR EQEBPIKaDP BYHWGPVKV 60 

DASGHFIiSXO ZJE»PIT88RR KRDLDGSEDH VmuSOEBK BLFENLTVUQ GFLSHSXIMB 120 

lanrGNXjSBVK HHASSAPZiCH IiSGTVLOOGT RVGTAALSAC BGriTOPPQIiP HGDFFIBFVK l&O

KHPIfVBGGYB PHIVYTtRQKV^ PETKEPTOGXi KDSVHXSQKQ ELHREKRERH KU?8RSr£RR 240 

8I&KERHV&T bWADTKKJS YSGSEHVESY II.TIMBIKVTG IiFHSIPSIGNA IHIVVVRIjII> 300 

liBBBEQaiiKl VBSASKTZkBS FCK$9QKSIliP KSDI^HPVHBD "VAVIiLTRKDI CAGFIISPCBT 360 

IfiLSHLSGHC QPSKSCansaB DSGXiPZAFTX AHBUSHBFGI QEDaSBNDCB PVGRHPYXI4S 420

lIQLQYDiPTPl* THSKC8EEYI TRPIiDRGHGF dCDIPKKKG LKSKVXAPGV :CYEFVHHQGQI» 480

QYGPISATFCO HWKNWUQjTIiW CSVKGFCfiSK LDAAADOTQC GBBaCHOffiCK CITVGKKPES 540 

XPGGWGaHSP W8HC8RTCGA 6VQ8MSRU3I IZPEPKPGGIQr CTGERKRVSIi CNVHPC&SBA 600 

PTPRQ*p caB FDTVPYKNESj YRHFPIFKPA EPCEIiVCSPZ DGQFS^CMLD AVXDGTPCPE 660 

GGIS&IIIIVCIH GICKHV6CDT BZDSKKIBrat C6VCZ£DGS8 OQTVRIEHPKQ KEGfiGYVDXG 720 

liXPtQQARDIR VHBIESQAGNF ZiAZRfiBDPEK YYXpNGGFrxQ HMGEnrKLAGT VPQ3rDRKaD£< 780

. EKMKPGPTM ESVWIQriI,Fa VTMPGIKYEV TIQKDGLDND VBQ*«PWa»S HWUECSVTCG B40 

TGIPRQTAUC IKKlSRGHVKA TPCDPBTQPN GRQKKCQBKA. CPPRWVIAGEW EACSATOGPH 900 

GSKKRTVIiCI QTMV8DBQKL PPTDOQHZ'&K PK3!£JiGCKBD HjCPSDHTVO SWfiBCBVSOG 960 

GSVRXR8VTC AXHHDBPCDV TaXfiSSBALC GLQQCPSSRR VXJEPmQGTIS HGKHPIPTUEP 1020

VPPPTSRPBH I»TTPTGPS6M ST8TPAXS8P SPTTASKBGD LQGKQMQD88 TQFBZ.SSKZ1j 1060

ISIGST8QPI LTSQSLSIQP 9EBNVSSSDT GPTGB6GLVA TTTSG9GI*S6 SKHPITirPVT 1140 

PPmTI.TKGP EMEIU&GSGS BREQPBDKDE SKFVIHTKIR VBGSDKBVE3 TEMP£AFPI,T 1200 

PDL8BBSnHP PPSTVHBGLI. PSQRPT3&BX PRVE6KVT EKPAHTLIiPIi GGDHQPEPSG 1260 

KTAHBUKI^ PISNHDICZTKSS EPVLTEEDAT S£>TTEBFhUa AfiHYKQIrTNG HGSAHHXVGBI 1320

WSBCSXTGGL GASfWKRVECT TQMDSDCftAl QR9DPAKRCH IKPCABNKVG HHSKCfiEOgCS 13D0

GQFiaREXQC VDSBDatOSflA PfHGQFLAGT PPPItSMSCNP SPCBAnQVEP WSQCSRSCGG 1440 

GVQSEIGVFCP GGLCDI9TKRP TSTMSCNEIIL CCmtilGtSVtD LCSTSOGOdF QKRXVCCVP8 1500 

BGNKTSDODQ CZiCDHKPRPP EFKKCEIQOAC KKSWDLIiCTK DKLfiASFOQT I^KAMKKCSVP 1560 

TVXtAECCF&C PQIHXTRTQR QKBQBZiLQKG KEIt 1593 

Seq ID NO: C370 Protein Sequence
Protein Accession #: Iip_00i053.l

1 11 21 31 41 51

I I I I I I

MRQSUQIjPIjV GliLFSPTPS QLCEICBVaE EHYIKUCPLI* HTHIQSNYNR GTSAVBlWIiS 60 

XiKI>VGXQIQT LHQKMIQQIK YDiVKSSIi&DVr SSGBI>&I>IIIi ALGVCRNAEB HI^IYDYHZiTD 120 

KLSNKFQAEI EHHEAHHGTP I.TNyYQI.SIJ7 VIAZtCLFIflGH TSTAEWHHF TPEHBXnCYFG 180 

9QFSVDTGAM AVLAl«TC\nCK SLINGQXKAD EGSLKNZSIY TKS2>VBKII«S BKKBHGLTGI? 240 

TFSTGSAMQA. XiFVSSDYTNE KDKNOQQTIJ7 TVTaTEISQGA FSNFIOAAAQV LPAI^MCSKTfL 300 

DINKDSSCVS ASGtOFNISAD EPITVTPPDfi QSYISVNYSV RIKErYFTHV TVtNGSVFJ^ 3 60 

VMBKAQKMtID TIFGFTMBER 5H6PYITCIQ GLCAISINHDHT YHELLSGG&P LSQG7168YW 420 

RNGBCOiEVRW SKY 433 

Seq ID NO: C371 Protein Sequence
Protein Accession #: HP_004?82.1

1 11 21 31 41 51 

I I I I I I

MCCTKSLLILA ALMSVZiZjUIIi CQESSAASNF SOCLGYXDRI LHPKFXVGPT ROLAEIBGCDZ 60 

HftllEHTKXK IiSVCANPKQT WKVIVRLLS KKVKHK 96 

Seq ID NO: C372 Protein Sequence
Protein Accession #: np_0374Q3.i

1 11 21 31 41 51 

I I I I I I 

HftGSPLLNGP WHQGWShrMyh UULFRPPP AIiCARPVKBP BGL8AASPFL AETOAPBRPR 60 

SSVPRGEAAQ AVQHIxARALA HLLBAERiQER ARAEf^BIUED QQASVIAQLL RVWQAPBHSD X20 

PALGU^DDFD APAAQXARAIi ZiIuaUiDPAiai AAQLVPASVP AAAURPKFCV YDD6PAGPDA IBO 

HBAODBTFDV DPCIiIJiyiiLG RILAGSAD3B OVAAPRHIAR AKDHDVGSEL PPBSVZ^QHUEt 240 

HViORLBTPAP QVPAHSIiIiPP 260 

Seq ID NO: C373 Protein Sequence
Protein Accession #: NP_002236.1

1 11 21 31 41 51 

I I I I I I

MLQSIiAGSSC VRLVWXRSA NCFGFEiVLOT ttEiYLVPOftW PSSVEIjPYED i:>XkI^ELIlXE.K 60 

RRFLBBHBCL 8EQQI1BQFLO RVIiEASNYGV SVZ.SHASGNW »WDFTSAI.FF A5TVLSTT6Y 120 

GHTUPLSDGfi KAFCIIYSVI GIPPTLIiFIiT AWQHITVHV TRRPVLYFHI RWGFSKQrVVA ISO 

ZVHAVLL6FV TV9CFFF1PA AVF^VTiEDDHT NPLBSFYFCF ISIiSTIOLOD YVPGBGVMQK 240 

FRBLYKXQXT CVIiIiLGI^IAM IiWI£TFCBIi HBLKKIBKMF YVKKDKIX3)Q VHIXKHDQLS 300 

FSSI7DQAAG MIOSDCKCONEP FVA7QSSACV S6PANH 336 

Seq ID NO: C374 Protein Sequence
Protein Accession #: NP_005463.1

1 11 21 31 41 51

I I I I I I 

MBTTNGTETW YESI^VXrKA liNATLESEfM. CRP6PGLGFD NQTBERRASli FGHISDffSYHY 60 

IlfFVMFZiFAV TVGSblLGyr SSSKVPHRSD PYUVYXXHKV SNI 103 

Seq ID NO: C375 Protein Sequence
Protein Accession #: NP_005236.1

1 11 21 31 41 51 

I I I I I I

MGSHLAIiIfUE. LliriFQHFGD BDGSQRLBQT PLQFTHIjHXM VTVQBHSAAK TYVGKPVKMG 60 

VYXTEPAWBV KYKTVSGDSB HZJVAEEYII. ODFCFLRIHT KGOrrAItiNR SVKDHXTLIV 120 

KAIiBKNT^rVB ARTKVKVQVIi DTUDLRPLFS PTSYSYSIiPB NTAIKTSXAR VfiATDADIGT 180 

HGEFYYSflCD RTDMFAJHPT SOVTVUTGRI. I>YI.ErKLYH4 EILAADRGMK ZiYOSSaififiM 240 

AKUrVHZEQA UBCAPVXTAV TZiSPSELDRD PAYAIVTVPD CDQSHMC3DXA SLSIVAGDIaD 300 

QQFRTVRSFP G8KBYKVKAI GDZDHDSHPF SVHIiTLQAXD IDSTPPQFSSV ICVZBVTfiFQF 360 

imaPVKPBKD VYHABISEFA PPMTPWHWK AIPAYgHLBY VFKRTPGKAK FaiiHXHTGI.1 420 

8II1BPVXCRQQ AAHFSUBVTT 5DRKA&TKVL VKTLGANBNF PBFTQTAYKA AFDESIVPIGT 480 

TXHSIiSKVDP DB6BNGYVTY SIANUmVPF AIZ3BFTaAV& ITSEHIiDYBIH PKVYTLIIZRA 540 

SDOIGIfFXItRB VBVZJUrXTUI XTUflDBlXPZiFB KXHCEGTIPR DLOVOBQlTX VSAIDADBLQ 600 

I.VQYOIBA16IT BLDLFSZiElPN 9SVI>9XaCR9I. HDQLOAKVSF HSL1tXTATD9 BiaFAm>l>YIlT 660 

ITVAASHRJiV N^QCBBTSVA KHliA£KU>QA HKIiHnQGEVB DIPFDdH&VH AHIPQFRSTXi 720 

PTGXQVKBIIQ PVG8SVIFM1? STDLDTGFIiKS KLVYAVSGGH ED8CFMIDMB TOMUCXL&PIj 7eO 

I^RBTTDKYTIi NXTVroi.aXF QKAAWKLIAV VWDANSNPP BFlbQSSYFVB V8EDKBVHSB 840 

IIOKBAaZlKD tiSraiGHyTyS ILTDTDTFSI DSVTawmA SPLDSELQBB RSZ.XXBIIBDQ 900 

ARBSmifST VWZVSIiEDV NDNPPTFXPP SYSVKVREDL PE6TVIKPII1B AHDBDDGQG6 9€0 

QVRYSXiXJmG E133FDVDKI.S <3AVS1VQQLD FEECKOfVYinyr VSAKDIQSKPV SLSSTOTVEV 1020 

EWDVNBSTLH PPVFS8FVEK GTVK03APVG ^VHTVSAHD EDAGRDQBJR YSlRDOfiSVG 1060 

VFKIGEETGV IErSDRU)RB STdBYULTVF A7DQGWFI>a SFIElYIEVB DVMCHAPQTg 1140 

BPVYYPBXMB ST&FKDVSWQ IEAFDPDSS9 NDKUtSCKITS GKFQGFFSZH PKTGIiXTTTS 1200 

SKU:>REQQDB RIIiBVTVICaZ GSPPK&TIAR VIVKIU3&HD IIKPQFIK2KFY KIIUjPERBKP 1260 

DRESNASSEP IjYRVIATDKD BGPNAEISYS XEDGNEHGKF FIEPKTGyVS 5KRFSAAGEY 1320 

bXLSIKAVDN GRFQKSSTTR miEWXSKPK QSLEPISFEE SFFTPTVHBS DPVAHKIQVI 1380 

sveppqzpijH fditgovyds hfdvdkqtot xxvakpi^dab QjKEOinaihTVB ATzxarmiiT 1440 

QVPIKVZDIN BH&PQF8TSK YEWIPGDTA PGTBZCdQX9A VDQDBKMIOjZ YTXiQSSRDFIi 1500 

SltKKFRIiDPA TG8tiYTSBKI> DEEEAVSPAHZi TVKVRDQ|DVP VHRKFARIW KVSSTBDBAP 1560 

KFTA89YKGR VYESAAVGSV VLQVTAItDKD KOimAEtfliYS lEfiCSnXGHIG KSFKIPFVZ<a 1620 

SXKTAKEU)R SNOAEYDLMV KATDKOSPPH SE1T8VRIFV TIADHASPKF TSKBYSVELS 1680 

BTVSIGSFVG HVTAHSOSSV VYEZZDCHTG DAFDXHPHSG TXITOKAU)F ETLPIYTLXX 1740 

QGlMNAGliSI ISTTVIjVHLQiD EI3DNAFVPHQ AEYIGLISBS ASZHSWUTD RUVPLVIHAA 1600

DADKDSEitAljIi VyHXVEESVH TYFAIDSSTG AIHTVIsSLDY EETSIETlPTV QVHDMGTPRL 1860

FABYAftNVTV HVIDXHDCPP VPAKPIjYE2VS LIiIjPTYKGVK VITVMATDAD BSAFSQLIYS 1920 

ITBGNlGEKf SMDYRTGALT VQKTTQIjRSR YELTVHASDG WiVSLTSViCI NVKESKESHL 1980 

KPTQUVYSAV VJOENSTEAET lAVITAlGSP IHBPLFYHri* HTOHRPKiaR TSGVI>STTGT 2040 

PPSREQQBAF DVWEVIEEH KPSAVRHWV KVIVKDQNDW AFVFVMliPYY AWKVDTBVG 2100

EVIRYVTAVD RDGGEUiaEVH YYUEBRHBSF QIGRLGEZSI* KKQFEIiDTUT KBYLVTWAK 2160 

DGGNPAFSAE VIVPITVMBIK AMFVFSKPPY SABIAESIQV HSPWHVQAN SPESItKVFYS 222 D 

l-TDGDPFSQP TINFirrGVIN VIAPIJDFBfffl PAYKLSTRRT 06LTG?iHAEV FVDirVDDIN 2280 

DNPPVFAQQS YAVTLSEASV IGTSWQVRA TDaDSEPNRG ISYQMFQNHS KSHEJHFHVDS 2340 

STOI>ISI.IjRT LDYBC^SRQHT IFVRAVDGQM FTIjSSDVIVT VDVTDIiNGSTP PLPBQQZTBA 2400

RISEEftPHGH FVTCVKAYDA DSSDIDKLQY SZLSGHQHKH FVIDSATG3I TI.SHLHfiHAL 2460 

KPFYSLHIiSV SDSVFRSeTQ VHVTVTOGHL HSPAFLQHEY EVELAEHAPli HTLVHBinCTT 2520 

DGDfiSIYGHV TYHIVNDPAK D&FYINERGQ IPTI»EKI*DRB TPAEKVISVR IMAKDAGGK9 2S80 

AFCXVNVII»T DDNEMAPQpa ATKYEVNICS SAAKGrSVVK SASDADBOSN ADITYAIEAD 2fi40 

£BSVK£1II.BZ mL&SVJTTK BSIiIQIiEN&P FTFFVBAVDM GS^SKBSWl* VYVKZIiPPBM 2700

QLPKFSEPFY TFTVSEDVPV GTBII3LIRAB QSOTVIiySIiV KGNTPBSNBD S8FVIDRQ8S 27^0 

HLKLSKBLDH BTTKI7YQFSI IiARCTQDDSE KVASVDV^IQ VKDANDNSPV FE9SPYEAFI 2820 

VEOTiPGOSRV IQIRASDADS GTWGQVMYSI* DQSQSVEVIE SPAINMETGW ITTLKBLIJHB 2980 

KRtJNYQIKW ASDHGEaciQl. SSTAXVITVTV TDVNDSPPRF TASIYJKGTVS EDDPQOQVlA 2940 

ILSTTDADSE SraBQFVTYFI TGGDPIA3QFA VETIQKSWKV YVKKPLDRBK EDNYLLTITA 3Q0O

■nJOTPSSKAX VEVKVLDAHD NSFVCBXTLY SDTIPBDV&P QKUMQISAT DADZRSHABZ 3060 

TyTLLGSGAB KFKZiHFDTGE LKT8TPLORE EQAVYHLliVR ATDGGGRFCQ ASIWTLEDV 3120 

HiaiAPBFSAD PYAITVFENT EPGTLLTaVQ ATOADkAGMIR KILYfililDSA PGQPSIHELS 3180 

GIIQI»EKPIJ5 RBLQAVYTLS IjKAVDQGIiPR HIiTATGTVTV SVUJINDMPP VPEYREYGAT 3240 

VSEOaCLVGTE VLQVYAASRD JEAHASITYS IISGNEHGKF SIDSKTGAVF ZTSSO^ESS 3300

HEYYCtTVBAT DGSTPSL&IDV ATVNVHVTDI HENTPVFSQD TVTTVISEDA VLBQSVITVM 3360 

ADDADGPSHIS HXHYSZIDON RGBVKVTKUi mETISGlTTIi TVQASDEK3SP 3420 

PRVNTTTVHI DVfiDVMDElRP WSRONYSVI IQENKPVGFS VLQLWTDED S8HNGPPPPP 3480 

TIVTGNDEKA PBVNPQaVU. TSSAIKRKEK llHYLIiQVKVA DNGKPQLSSli TYroiRVIEB 3540 

SITPPAIIjPI. BIFIT&SGEE YSGGVIGKIU ATDQDVYDTI* TY&IiDPQUDH ItPSVSSTGGK 3600

LZAKKKLDI6 QYIiUIVaVTD GKFTTVADIT VHZRQVTQrat IHHTZAZRFA 1II.TFKBFVGD 3660 

ynSNPORAZA SXLGVRRNDI QXVSXiQSSEP HPHtiDV&DFV BKPGfiAQIfiT XQLIjIiSTNSS 3720 

VTDISEIIGV RILMVPOKLC AQIiDCPlrlKPC DEKITSVDESV MSTKSTAHItS FVTPEHHRAA 3780 

VCLCKBGRCP PVHHGCEDDP CPEGSECVSD PVfEEKHTCVC PSORPtSQCPG SSBMTI»TGNS 3B40 

YVKSfHLTEHE NKLEMKLTMR tRTYfiTHAW MTARGTDYSI LEIHHGRIiQY KEDCGSGPGI 3900

VSVQSIQjVND GOHHAVAliBV MGNYARUVLD OVHXASGTAP (3TUCTUIU» YVFFOQHZRQ 3960 

QGTAHGRSPQ VGSK3FSQCMD SXYtNGQELP LKSKFRSYAH IBSSVDVSPO CFI«TATEDCA 4020 

SNPCQNGGVC MPSPAGGYYC KCSAIjYIGTH CEIfiVNPCSS MPCLYOOTCV VDNG6FVCX1C 40BD 
RGI^TGQROQ LSPYCKDEPC KSlGGTCPDSEr DGAVCQCDSG PRGERCQ3DI DECSGMPCUI 4140 
GAIiCElSXHaS YHCNC&HBYR GRHCBDAAPH O^S^PHHIG IiAEQZOZWF VAGIFLLVW 4200

FVIiCRKMIfiR KKKHQABPXD KHLQPAXRPL QRFYFDSKIiH KmYSDIPPQ VFVRPISYTP 4260 

SIPSD8RIDII> DSHSFBGSAX PEBPEFSTPH PB5VHGHRKA VAVCSVAF»I> PPPPPSN8PS 4320 
BSDSIQSCP&tr DPDYDTKWD SJ3PCI.SKKFX. EHSCPSQPY9A BB5LSEVQSL SSFQ9BSC33D 4380 
NSSBMnrSDn HP3VPLPD1Q KPPNYEVIDB QTPI.YSADPN ArDTnYYPGa YDIESDFPPP 4440
PEDFPAADBI* PPI*PPEF6NQ FESIHPPRDM PAASSIiGSSS RNBQRFNUNQ YLPEIFYPJJIH 4S00
SBPQTXGTGB NSTCREEUAP YPPGYQEtHFE APAVE9HPKS VXA8TASCSD VSAOCEVB&B 4560 
VHKSXTYESGD SGHEBBVlMP PLDSQQUTBV 4590 

Seq ID NO: C376 Protein Sequence
Protein Accession #: NF_05S03B.l

1 11 21 31 41 51 

I I I I I I

MCYGiCCARCI QHSIjVGliALtj CIAANILLYF PNGEUnfABB 2flHL8RFVWPF SQIVGGGLIM 60 
IJ^PAFVFIGL BQDDCC3GCX:G HENCGKRCAM IiSSVLAALXG ZAfSSGYCVIV AALGIAEaPl. 120
CLDSXriGOnm TPASTBGQfYL IJ>T8TW3BCr BPKHTVBHNV SLPSJZiLaZJS GXEFILCLZQ 180 
vmSVliGGrC GFCC8BQQQY DC 202 

Seq ID NO: C377 Protein Sequence
Protein Accession #: NP_003750.1

1 11 21 31 41 51

I I I I I I

WSTHOVEGKP filOLGERGRAR GSTFIiKWQP MFMHSIFTSA VSPAAERIRF ILGEBDDSPA 60 

PPQI^FTBUDB UiKVDQQrafB WKEIARHIKF EEKVBQGGBR WSKPHVATZ.? IJISI»FEURTC 120
HBHOSZIODR EASELPQLVB MZVUBQIBTa U[jKPBLKDKV TKTUJ2KHRH QIKKSHURSI. 160
ADXGKTVSSA SRNFINPDHO &FAKXHRNLT S9SUID1&DK tOCKLPRDABA 240 

SEIVIiVGEVDP LDTPFIAFVR LQQAVMLOAI. TBVPVPTRFI* FH^LGPSGKA KSYHBTORAX 300 
ATLM8DEVPH DIAYKAKHRH DIjIAGIDEFL SEVIVIiFFQB WDPAXR7EPP KS£iPSSDKRK 360 

SKYSGGENVQ MN6DTFHDGG BGGGGHGDCB HliORTGRFCS 6£»ISDIKFKA EPFABDFYTA 420
XMZQftZiSAIL FIYIiATVTNA ZTFOGbLGDA TCONMQGVZiES FU3TAVSGaX FCE^AGQEUT 4 BO 
TL8ST0PVE*V PBHXiX«EHFSK DEDilPDYZiBFR IiWlGLWfiAPb CLICVMnAS FLVQYFTRFT 540 
BEGPSeiilfiF IPrYI>AFKKM IKIADYYPIN SBIPKVGnJTt PSCTCVPPDP ANISISNDTT 600 
LAPEYLPTMS STPKITHNTTF BWAFLrSKKEC SKYGtaTLVGN NCHFVPOITIi M&FIIiPIiGTY 660 

TSSHALKKFK TSPYPPTTAR KLZSDFAIZIj SZLIFCTIDA ItVBVDTPSIiI VPSEFKPTSP 720

NRGHFVFPFG EMPimvCKAA AlPALI.VTZIt im)QQI!EAV XVNRKEinCLK ZGZUSYHLDLF 780 
WAXXMVXCS UtRLPHYVAA TVISIAHIDS UCMETBTSAP QBQPKFLGVR BQKVTGTLVF 040 
II/TGLGVFMA PXUCFXPHPV liYGVFliYMGV ASIiNGVQFMD RL1CLLLKPZK EQPDFXYLRH 900 
VPUtRVHLFT FLQVLCXALtt niI<K9TVAAX XFPVMXIiALV AVRKGNDYLF SQHDIjSFLDD 960

VIPEiCDKKKK EDEKKKKKKK GSLDSD11DD8 DCPYSBKVPS nCXPMDlMEQ QPFL&DSKPS 1020
DREBSPTFliB RHTSC 1035

Seq ID NO: C378 Protein Sequence
Protein Accession #: NP_000949.1


1 11 21 31 41 51 

I I I I I I

MSTP0VN93A SLSPDRLNSP VTIPAVMPIP C3WGNLVAIV VLCKERKEQK ETTFTCTLVCG SO 

lAWlDhUSTli LVSPVTIATY MKGQWPC3C9QP IiCEYSTFII*!* FFSZiSGIiSII CftMSVESYIA 120 

ZVaSUTFySHy VDKRLAGtiTCi FAVYASNVLF CAI>PHHGLOS SRLQYFDTHC FIDnTTNVTA IBO 

KAAYSYHYAG FSSFZiIIiAW LCI3VIiVQOAI* LRMHRQFMRR TStXSTBOHHA AAAAS7ASRG 240 

HPAASPALPR LSDFRRHRSF RHIAGAEIQK VILLIATSLV VI^ICSIPLW HVFVNQLYQP 300 

SI.EREV3KNP DLQATSIASV NPILDPWIYJ I^LRKTVLQKA lEKIKCLPCR lOGSRRBRSG 360 

QHCSDSQRTS SAMSGBSRSF ISRBLKEISS TSQTLLPDIjS IiPDIjSSHGLG GBNLLPGVEG 420 

E48LAQEOTTS TJRTIAISETS DSSQGQDSES VUi-VDBAGGS OSAGIBAPKGS SLQVTFP&ET 4B0 

miaSIKKCl 488 

Seq ID NO: C379 Protein Sequence
Protein Accession #: NP_002650.1

1 11 21 31 41 51 

I I I I I I

MGHPPLLPIiL LUaJTCVPAS WQUiCMQCKT ITCDCRVEECA DGCpIiCXTTl VRU?EEX3EEL 60 

BI«V£KdCTHS SCEMRTLSYK TGLKITSLTB WCQLiDIjCNQ GNSGRAVTYS RSRYIiECISC 120 

GSSDM6CEKG KHQSLQCSISP HHQCIiDVVTH WIQBGEBSRP KDDRHLRQCG YLPGCFGSHG ISO 

FHMISDTPRFIi KCCSTETKOlE fiPIIiEbBHLP QHGRQCXSCK GGtSTHGCSSB BTFX*ZDCRGP 240 

HHQCLVArTGT KEPKNOSYHV HOCATASHCIQ HAHLCTlAFSH NHZDV&GCXK GGCNHPDUTV 300 

gyRSCIAAPaP OFABLSLTIT LLMTARIiWGO TLLHT 33S 

Seq ID NO: C380 Protein Sequence
Protein Accession #: BAB55406.1

1 11 21 31 41 51 

I I I I I I

MDFE9GQVDP lASVILPENI. I.EEIL6PED8V XiVBRAQFTFF ]aKXGLFQDIV0 PQRKTLVfiW 60 

HACaiGNTXI QtaiiKDPVQIK HOTIRTQEVH HPICA7WDLN KHKSFGGHHT SGCVAHRDSD 120 

A8ETVCLCNH FTBPGVLMDL PRSASQLDAR NTKVI.TFIsy IQCGISAIFS AATblVTYVAF 180 

EKUIRDYPSK lIiMSIL&TALL FLKIIiIiFU^O WITSFNVDGL CIAVAVIiIJiF FDLATFTHMG 240 

LEAlHMyXAJi VKVFNTYXRR YIIiKFCXlGW GLPAIiWSW lABRSnSSiSEVY GKBSYQKENS 300 

DBFC^QDPV XFWTCftGYF GVMFFIiinAM TIWMVQIGG RNGKRSHKrL KESVLIOdUUS 3fiD 

WSLTFLLGM THGFA7PAHS PUHIP^BHaTLF SIFNSIiQOZiF IPIFHCAKKE HVQKQnBRHL 420 

CC3GRFSZiAZai SDH6KZATNI IKKSSDHLQIC 3LSSS&IGSH STYI>TSKSXS SSTTYFKRNS 4B0 

UTDMVSVSHS FHK9G3EfiQC FHGQftfliVKTG PC 512 

Seq ID NO: C381 Protein Sequence
Protein Accession #: NP_000565.X

1 11 21 31 41 51

I I I I I I

KTVARPSVFA ALPZJj^lIiPlL IiUiIiVMiCZtP AVWQDOQIiPP DVPHAQPALB GRTSFPEDTV 60 

ITYXCBBSFV KIPGBKDSVI CLKOSQOfSDZ EEFCamSCEV PTRZiHSASUC QPYITQNYFP 120 

VGTWKYECR POVUHEPSLS PKZjTCLQNIiK WSTAVEFCKK KSCPNPGEIR HOQISVPGGI 18Q 

IiFGATXSFfiC HTGYKLFGST SSFd^ISOSS VQW&DPLPBC REIYCPAPPQ IDKGIIQGER 240 

DHY6YKQSVT XACHIDSFTMX GEHSIYCTVN NDEGENS6PP PBCRGKSIjTS KVPPTVQKPl: 300 

TVNVPTTBVS PT8QKTTTKI TTPHAQATRG TFVSRTTiaiF BBZTPHKasa TTSGITRXJjS 360 

GOTCFTITOIi XiOTEiVTMaiai T 381 

Seq ID NO: C382 Protein Sequence
Protein Accession #: eeguence

1 11 21 31 41 51

I I I I I I

KDT8RLGVLL ShWhUQlAI QQSSPRGGVL LSGCPTHCHC EPDORMDliRV DCSDLGr.SEI> £0 

P8NIi5VFT8Y LDLSMBlNISQ ZJjPNP&PSZA FIiSEIAUVSN ALTYIFKGAF 7X3LYSI.KVLM 120 

LQNNQtfRBVP TBAICTURSZi QSIAUOAHBI SYVSPSCF98 £HSE.SXIIMU> DHAUTBIPVQ 160 

AFSfiliSALOA MTLALNKIHH IPDVAraiLS SZiVVU!LHNIjr HIRSK^QICKCF UGLHSCSTUP 240 

UrnmLDBFP TAXRTIiSHIjK fiLRPYDNPIQ FVGRSAFQEH. PEIiRa:l,TXJ9G ASQXTEFPDIi 300 

mSTAMLBSLT LTG&QI8SLP QTVaaOLPNI. iJ^tDLSYKLt EDIjPSFSVCQ KLQKIDLRHN 360 

BIYKKXVDTF QQU^SliRSUI UWNKIAIIK PNAFBTIiPSl. XKIiDLSSt3LI* SSFFITGt^ 420 

. mauaaQsoi alqslxsseh fpel^vjcekp yayqcckfot c&sunamsHQ wnksdhbshd 480 

DXBKKDA0KF OAQDESIDI.BD Ft^FSSDUH AISSVQCSFS EQPFKPCERb ZdXJHZiIRZGV 540 

WTIAVLALTC NAIiVTSTVPR flPXiYISPIKI. LIGVIAAVUM LTGVSSAVLA GVEAFTEGSF 600 

ARHGAVWEKIG VGCHVXQFL6 IPA&BG8VFL LTUVA^BRGF 9VKY8AKFCT 660 

XZ^IiCAlJAIiT MAAVPIiLGGS XYGABPLCLP If FOEPSIMG YNVALIIiINS LCSPIMMTIAY 720 

TKLYCNU3KG DIiEKIHDCSK VKHIAIililiFT NOIU2CFVAF liSF9SUIIX»T FISSBVXKFX 7B0 

IiXiVWFbPAC UBTPIibYlLFN PHFKEDLVSL RnQTXVWZRS KBP&tMSlllS IlDVEKQfiCDS 840 

TQALVTFT&S SITYDI^PSS VPSPAYFVTE SCHLSSVAFV Pdi 883 

Seq ID NO: C383 Protein Sequence
Protein Accession #: !llP_00365a .1

1 11 21 31 41 51 

I I I I I I

MDTSRLGVUb SLFVZiLQLAT GGSSpRSGVIt ISOCPTHCHC EPDGRMItUtV DCSDliGLSEL 60 

P&SILSVFT8y XXkZlS^BiDnSO liLPNPIiPSIA FLfiBLRLAGK AUTYIFKGAF TGLYSliKyiH 120 

LQHMQLRHVP TE»LQBlIABb QSLfiUUDiaiSHI 8YVPP8CF9G IBSUOILMLD DigAI/EEZPVQ 180 

AFRSZtSAXdQA MTZiAIAIiaBQ ZEDYAFrailjS SLWXJXLHNK HXHSZ^nOCCF DGLHSIiBTUD 240 

UmtHLDSFP TAISTLSNIK ELGFRSNNTR SXPEKAFVGN PSLITXHFVD NPXQFVGRSA 300 

PQHT.Tipr.PTT> TIJ5GA9QITE PPpliTGXAKI* BSLTl/TGAQI S3l/PCyXVCNQ TjPNLQVLDLS 36Q

YNXjLSDLPSF SVCQKIiQKZD LRKNBIYSIK WrPQQI.LSI> RSIJOIAWNKI AXISPNAFST 420 

I/PSLIKIJ3I^ SMliI.SSFPIT GLRGLTHLKL TGNHALQSIjI SSENFPELKV IEMPYATQCSC 450 

AFGVCEKAYK ISNQHlfKGOH SSMDDIjHKKD AI^MFQAQDER DLEDFLUJFE EDljKAIiHSVQ 540 

CSPSPGPFKP CEULIJ36WLI RIGVWTIAVIj AI,TCNA1OTS TVFRSPLYIS PIKLLIGVIA SOO 

AVNMUTSVSS AVLAGVDAFT FGSFAKniBAH nSHOVGCtiVX GFLSIFASBS SVFLI/rLtAAI* 660 

ERGPSVKYBA 3CEETKAPFSS IiKtfllLLCAL LALTMAAVPl. U»3SKin3ASP LCLPL&PGEP 720 

STKGYMVAIiI LUJSIjCFLMM TZAYTKDYCN LDKGPI-EMIW DC5MVXHIAI. TJiPTUCILNC 7B0 

PVAFI£FSSL IKLTFISPEV IKFILIiVWP LPACLNFIiIjY ILENPHFKED LVSLRKQTYV 840 

nXRSKHPSLM SINSnDVEKQ SCDSTQAIjVT FTSfiSITYDI> PPSSVPSPAY FVTESCHL8S 900 
VAFVPCIi 



Seq ID NO: C384 Protein Sequence
Protein Accession #: NP_003497.1



1 11 21 31 41 51

I I I I I I

HEMFTFliLTC IFLPLLRQHS UTCEPITVP RCTKMAYMMl* FFPHUraHYD Q8IAAVEHEH 60 

FIiPLANIiBCS PNlErrPLCKA PVPTCIEQltt WPPCRKLC5 KVYSDCKKLt DTFOIHWPEE 120 

I^BCDRIiQYCD ETVPVTFDPH TEFIiGPQKKT EQVQRDIQFW CPRHLKTSQG QGYKFLGIDQ 180 

CAPPCPBHYF KSDEIiBFAXS F16TVSIFCL CATLFT7LTF IiZDVUaFRYP SRPIIYXSVC 240 

YSJVSIiKYFl GFLLGDSTAC 2IKAD&KIiEL6 DTWLGSgNK ACTVItF»U.Y FFXMAGTVWH 300 

VILTITHPIA AORKW5CBAI BQKAWFHAV AHGTPGFC/TV KLIAUSKVEG XWISSVCFVG 360 

IjYDUIASAYF VLLPIiCLCVF VCLSUiLAGI igtNHVRQVI QEDGRNQERIi KK^IRIGVF 420 

SGLYlOTIiVT IiIiGCYVYEQV NRITWBITWV SDHCROYHIP CPYQAKAKAR PELftUSWlKY 4 BO 

LKTI*IVGISA VFWVGSKKTC TSHAGFFKBH SKSDPI^SR RVIKIBSCBPP EiKHNSKVKHK 540 

KKHYKPSSEK UCVZSKfiMGT STQATANHGT fiAVAITSHDT LGQ&TLTBZQ T5PBTSMZ2EV 600 

KADGAfiTPfiL REODOQEPAS PAASISRXiSS EQVDGKGQaG 8VSESASSB6 RXSPKSDZTD 660 

TCIiAQSNiaLQ VES88BPSSI1 HOarSLLVHP VSGVRXEQGG GCESDT 706



Seq ID NO: C385 Protein Sequence
Protein Accession #: HP^OOOS'73

1 11 21 31 41 51 

I I I I I I

I61ZAVICFCL L6ITCAIPVK QADtSGBSBBK QIiYNBXPDAV ATHI^HPDP&Q KONMAPOTIi 60 
PSKSNBSHHEX MDDMDDBDDD PHVDSQDSXD SdDSDDVDDT DDSEQSDBSS HSEIESDSIiVT 120 
DFPTDLPATE VFTPWPTVD TYDGRGDSW YGUlflKSKKP RRPDIQYPttfV TDEDITSHMB 180 
S5ELNGAYKA IPVAjQDLNAP SDWDSRGKDS YETSQIiDDQS AEIHSHKQSR liyKRKA>n>E8 240 
HEHSDVLDSQ ELSKVSREFH 5HEFHSHEDM IiWDPKfiKEE DKHIiKFRIfiH ELDBASSEVN 300 



Seq ID NO: C386 Protein Sequence
Protein Accession #: KP_Q02812



1 11 21 31 41 51

I I I I I I

MGAARGSPAR FRHLPLLSVIi LLPLLGGTQT AIVFIKQPSS QDAEiQSRBAL lACBVBAPaP 60 

VHVYKUiCaA ^VC^TERRFA QOSSLSFAAV DRLQD9GTFQ C!VARX)iDVTGB EAR8AliZA8FEI 120 

IKWIEAGPW IiICHPASEABI QPQTQVTURC HIDGHPHPTY OWFREQTPLS DGQSNnTVaS 180 

JCEEHIiTEiUPA GP&HSGCiYSC CAHSaPGQAC S8QNFTI>3IA DB&FAKWIA PQDVWARYS 240 

BAMfHOQFSA aPPPSLOHItF BDBTPimRS REFHLRRAW FAHGSUUiTQ VRPRKAGIYH 300 

CXGQGQRGPP IILBATUilA BIEDMFICTP RVFTAQSEBR VTCLPPlOStiP EPSVMWEHAO 360 

VRZaPTBGRVY QSOHBLVIJVN lAESDAGVYT CHAJaiLftGQR RQDVNITVAT VP8fflKKFQD 420 

SQLEBGKPGY UDCLTQATPK PTWWYfiNQM IiISEDSR^EV PKHGT1»RXHS VBVYDGTWYR 480 

CH8STPAGSX BAQARVQPVI>B ECLKFTPPPQP QQCMEFDKBA TVPCfiATGRE KPTIKHEELAD 540 

OSSLPBHVTD HAGTXiSIFARV TBIXDAGIIYTC lASnaPOGQI RABVQUTVAV FITEKVEPER 600 

irCVYQGarAEi Z^QCBAQGDPK P!LIQB9KSBI>R IIiDPTKIfGPR HHZFQHG&LV IBEIVAPEDSG 660 

RYTCIAGHSC NlKarBAPI.Y WDKPVPEEB EGPG8PPPYK KIQTIGLgWS AAVAYIIAVIi 720 

GEMFYCiCKRC KAKRZiQKQPB GEEPEKECLN GGPLQKGQPS ASIQEEVALT SliGSGPAAlK 7B0 

KRE3TGD8MU FPRSSM2PZT TLGKBSFGEV FIAKAQGliEB GVAEXEiVIrVX SLQTKDEQQQ 840 

LDIFRKEI^EHF GKUlHIUaWR EiLQLQlEABP HYKVIiEyVDIi QDLlGQiFIjRIS KSKDSKUCSQ 900 

PEiSTlQQEmKL CTOVAIiGMEH ZjfiNHRFVHKD XAASMCItVfiA QROnnCVSOOiG L8KDVYNSEY 960 

YBFRQAHVPL RnM6PBAXi;B GDF8TK60Vlf AFGVLHRBVF THGEMPHGQQ ADDBVLADQCiQ 1020 

AGKARLPQPE GCPSXIiYRZM ORCHKLSFRD RFSF5EXA8A IjGDSTVDSKP 1070 



Seq ID NO: C387 Protein Sequence
Protein Accession #: NP_002300.1



1 11 21 31 41 51 

I I I I I I

MKVLAAGVVP XJJiVZasnCHG AGSPIiPXTFV MATCAIKHPC BNHIMSQIRS Q&AQUOGSA» 60 

AIiiFZXiyYIAQ GBPPEEINIJ^K haSPOVTOt^ PEHKNOTEKA KItVELYRIW TCUSLBSUSSTT 120 

RDQXUillPSA LSIiHSSXdlAT ADILRGLLSN VLGRLC8KYH VGRVDVTYGF DTSGKDVFQK 180 

KKLGCSQLIGK YKOXIAVZAQ A? 202 



Seq ID NO: C388 Protein Sequence
Protein Accession #: XP_09750B

1 11 21 31 41 51

I I I I I I

MGRPRUIIaVC HVSIIZSARD LSMBIHI/CBEiQ FGLFflHLRFX* EBLKL8GNHL SaZPQQAFSG 60 
USaUXliaOK^ NHQILGGIPAE ALWEZtPSUgS LRLDAHEiISI. VPERSFBGIiS SI>RHL.W£iDDIiI 120 
AXiTBZFVBAn KIK7X>PALaMfr LALKRISBZP SXAFQ^ISL WUnffiNNRI QBLGTHSFSG 180 
Z^Hm^ETLDUil YHKLQBFPVA IRTLGRLQEL GFEHUNIKAI PEKAF»GNPZ> I^IHFYDNP 240 

IOFV3RSAFQ YLPKUITtSL HGAMDIQEFP DI/KGTTSLBI LTlxTJElAGISL LPeGMOQQLP 300

III>RV£>EC»SHN QJSBLF8LHR COK]j££ZGLQ HHRIHBXOAD TFSQLSSLQA LDLSWHJllRS 3€0 

IHPBA7STUI 8LVKLDI.TDH QLTTLPUlQIa OGLMULKIfKG NIiAI»SQ3^5K DSFPKLKILB 420 

VPYAYQCXSY GMCASFFKAS GQHEASDIfHL DDEESSIOWL CLIiARQABNII yDQDI'QGIX}!' 480 

EMED8XI«FS VQCSPTFGFF KPCBYIiFBSW GISIAVHAIV LLSVLCSKSbV EiI>TVFAGGPV 540 

PUPVKPWO AlftGANTIiTG ISOOIJAfiVD ALTFGQPSEy GARWETQLGC RATGFIAVLG 600 

SBASVLLLTL AAVQCSVSVS CVRAYGKSPS IjQSVRAGVLG CliAtAGIAAA l»PlJiSVQEYG S60 

ASPI^diFYAP PBGQPAAL6F TVAI>VMMHSF CFIATVAGAYX lOiYCDLPRGD PSAWDC3WIV 720 

BHVAHLIFAD GLLYCPVAFI. SFASHLGXiFP VTPBAlVRSVI. LWbP&FACL UPLUTLLFNP 7B0 

KPfitXDLElBIrH PRA6D8GPIA YAAAGELEKS SCDSTQKLVA FSDVDLIIiEA SEAGRPPQUB S4D 

TYGFP9VTI>I SCQQPQAPRI. BGSHCVEFECS NBFG19PQP8M DGBLLUOEe 5TSAGGGL86 900 

OQOFOFSGLA FA&mr 9 IS 



Seq ID NO: C389 Protein Sequence
Protein Accession #: np_5709O1



1 11 21 31 41 51

I I I I I I

HASLVSLELG XiLTiAVXtWTA TASPPAOZiZiS LIASGQGUiD QBALGGl^HT IAD(RVHC!EN6 60 

POGKCIiSVED AUSEiGBPDGS GLPPSFVLBA RYVARIiSAAA VIiYIiSKFElSl CEDlllAGItWA 120 

SHADHE.IALL ESPKALTPGU SKbLQi^OAR AAGQTPKTAC VDIPQLLEEA VGA0AP(?SAG 160 

GVIiAALLDHV RSQSCFHAIjP 9PQYFVDFVF QQHSSEVPHT LAELSALMQR L6VGREAH8D 24 D 

HSBSHRGA8S RDPVFLISS8 NSS9VHDTVC I«SnSDVHAAY GLSEQAGVTP BAHAQIiSPAIi 300 

I^QQC^OACT fiQS&PFVODQ LSQSSRYXiYG SIAIUjXCIiC AVFBErlilAVC TGCRGVAHYI 360 

LQTFLSLAVG ALTGDKVIjHIk TPKVIiOIiElTH SBESL&PQPT HRLUWAaL YAFFLFENIiF 420 

1ILI.I.PBDFED I.EDGP0GH88 KSRGGHSHGV SLOLAPSSLR QFKPPRBG6R ADLVABESFB 4B0 

LLKSPBFRSOjS PELRIiLFYMX TLODAVHNFA DGLAVGAAFA SSWKTQIiATS LAVFCHBLPH 540 

SLGDFAALIJl AGLSVRQAItli LNLASALTAF AGIiY^aUlVa VSEESBAY9IL AVATGI1FZ.YV 600 

ALCDKLPAHb KVBDPSPttLL FIsLHNVOIjLG GWTVIiLbLSX* YEDDITP 648 

Seq ID NO: C390 Protein Sequence
Protein Accession #: NP_0ei844



1 11 21 31 41 51

I I I I I I

MAKASEFGGS GGGEAAAXiGIj KIiATLSIiLLC VShMSISVl^A LLTVRER8LH RAPYYBLLDIi SO 

CbADGLSALA CLPAVMIiAAR SAAAAAOAPP GALGCKLIiAF LAALFCFHAA PLXiUGVGVrR 120 

YliAISHHRFY ASSIiAGWPCA AMLVCAAWAL ALAAAFFFVI. DaOGODBDAP CAZ.EQHPDGA 180 

PGKLGFIjLXiIi AVWGATHLV YIjRULFFIHD BRIOiaPAKIiV PAVSBmrrFB QPQATQQAAA 240 

SHTAaFGRaP TPPAXiVeiKP ASPGaSMtRL LVLEBFKTEIC SUSKFltAVT IiLFLLIMGFY 300 

WASXLSVZAr KPGAVPQAYX. TASVWITrFAQ AQXHPWCFL FNRELRDCFR AQFPGC3QSFR 360 

TTQATBFC3>£i HSICIL 375 



Seq ID NO: C391 Protein Sequence
Protein Accession #: HP D05622



1 11 21 31 41 51 

I I I I I I

HAAARPARGP BIiPLLGLXiLL LLIiGDPGRGft ASSGNATQFG PaSAGGSAZtlt BAAVTGPFPP fiO 

LSHGGRAAPC EPUOHVdiG SVLEYGAXST LLAGDSDSQS EKHGBIiVIiWS GliRHAPROIA 120 

VIQPLLCAVY KPKCEHDRVE ItPSRTLGQAT RGPCAZVBRE RGWPDFLRCT PDRWEOCTR IBO 

^QiSfXKPNSa GQCEVPLVRT DNFKSWYEDV BGOGIQGQNP LFTKAZaQDM HSYIAAFGAV 240 

aX3LCTi;PT]jA TFVADMRHStl RYPAVILFYV NACFFVOSIG HIiAQFMDGAR BEIVCaADGr 300 

HBXCTFTSHE TLSCVIXFVZ VYYALMACW WFWLTYAMH TSFKALGTTY QPLSGXTSYF 360 

HLLTIII8LPFV LTVAHiAVAQ VDGDSVSGZC FVGYXBTfRYH ACTVLAPIGZ* VEiTVGGyPIiI 420

ItQVHTIrFSlK SNHPGLtiSBK AASKIHBTML RLGIFGFLAF GFVT.ITF5CH FYDFFHQAEH 4B0 

SRSFRITYVIiC QAHVTIGLPT RQPIFDCBIK »RP3ZJ»VSKI I9I.FAMFGTGI AMSTHVWTKA 540 

TXtliinBRTNC IU<TaQSDDEP KRIKKBKKXA KAF8KREELL QNPGQELSFS HHTVSHDGPV 600 

AGLAFDUNEF SarFVSSAffAQ BVTKMVAKRQ AnfODZSVT PVASPV^EE QANLHLVEAB 660 

ISPELQKSIiS RKECXRSKBXIC SVCPIAPPPB IiBPPAPAPST IPSEEIiPaUPXlO KCLVAAGAHG 720 

AGDSaiQGAW TLVSHPFCPE PSPPQPPFCiP SARAPVAHAEI (aXRQSLGPXH SKIKUSDTBL 7B0 

788 



Seq ID NO: C392 Protein Sequence
Protein Accession #: BAC04382



1 11 21 31 41 51 

I I I I I I

MGARSGARGA LLXALXiLCHD PJOiSQAGRKR SSBVIMSFP SAPftSPEiPYF ZfflPQDAyiV 60 

KSIKFVEljRaR AFPATQIYFK CNt3B»VSaED3 WrXQ^GtSDEA TLGASGGURV RBVQIBVGRQ 120 

QVBBLFGLSD YWGQCVAVSS A6TTKSRRKY VRIAYXfilOIF CQEPL6KEVP LDHEVXiLQCR 180 

PPB6VPVAEV EWLKNEDVID PTQDTNFIiXiT IDEKLIIRQA RIiSDTANYTC VAKEaVAKRR 240 

SrCTATVXVYV NGG«&S0fAB9f SPCSHSGGHG WQERTRTCIM PAPLSTOGAFC EGQAFQKTAC 300 

TTICPVDGAW TBMSKnSACS TECAHHRSRB OtAPPPQHGG RSC9GTZaJ>S 1CNCCD6LCKQ 360 

NXKTZiSDPHS HI.I1BA6GDAA I»YAGIdWAIF WVAILHAVQ VWYSRNCRD FDTDTTmSSA 420 

AI,TG6FHPVN FKTARPSHPQ CiI^aFSVPPDI. TASAGXYRGP VYALQDSTDK IPHTHSPIiLD 480 

PliPSItEVEVY SSSTTGSGPG IiADGIlDLLGV LPFGTYPSDF ARDTHFI>aL1l SA8LQSQQLI1 540 

QLPfiDPGSSV SGTFGCU3G& ZiSrPQTGVSI* ZiVRHGAlPQG XPySHYUtlH KABSTLPLSE 600 

GTQTVZiSPSV TOSPTGLLLC RFVILTMPBC ABV8ASDWIF QLKEQKHQQXl WBBWTIiDBE 660 

nmscicciL bpbachili.d qlgttvftge sysrsavkrii oiavfapai.c: tslbyslrvy 720

CLEDTPVAIiK EVLELBRTLG GYXiVBBPKPI^ MPKOSyHETItR I>8l£DLSHAH ITRSKHiIAKYQ 760 

EIPFYHIWSG SQKAmCTFT LERHSIASTB DTCKICVRC3fV BGBGQIFQLH TTLRETPAGS 840 

LDTIiCSAFQS TVTTQPEiGPYA FKXPr^IRQK ICNSIiDASHS RGNDHRHLAO KIiSMDRYIiKY 900 

FATKASPTGV 1U3UHEALQQ DDGdUVSZAS ALBBMK^eSBM I>VAVATDGDC 950

Seq ID NO: C393 Protein Sequence
Protein Accession #: NP_004616

1 11 21 31 41 51 

I I I I I I

KNRKftStRCLG HLFLSLG9WY I.RIG6F5BW ALGRSIICNK TPOUVPRQRA ICQSEPDAII 60 

ITZGElGSQMGIi DECQPQFBKG RWNCSALGER TVPCKBUCVO SREaiAFTYAI lAAOVAHAlT 120 

AACTQGWLSD CGCDKEKQGQ YHaDEGWKWG GC3ADIRTGI GFAKVFVDAE EIKQKRRTLM 130 

SXLHNNEAGRK IXiEENMKIiBC KCHGVSGSCT TKTCWTTLFQ FRELQYVIjKD KYNEMTHVEIP 240 

VSASaNlCRPT FLK1KKPL8Y RKPMDTIXLVY lEKSKiyCEE DFVnGSVGTQ GRACmCTAPQ 30Q 

ASGCDLHCX36 RGlfHTHOYAR VHOCKCSCFHW CCWKCSITCS ERTEKVTCK 349 

Seq ID NO: C394 Protein Sequence
Protein Accession #: NP_003777

1 11 21 31 41 51 

I I I I I I

MDALOGSQEL GS3CFWDSl«iS VHTKMPDIjTP C!FQHSLLAWV PCIYIAWALP Cyi.T.YU?HHC 60 

SiaYZUiSEliB KLKKVLGVLL HCVSHADLFY SPHGIsVEGRA PAPVFFVTPXt WOVTEMLIAT 120 

IjI*IQYERIiQ6 VQSSOVIjIIF NFLCWCAEV t>FB£iaijIAK ABSBiaDFFR FTTFyiHFAZ. 180 

VZiSUiILACF SBKPPFFSAK HVDEHPYFET BA6FIiSRI*FF HHFTKMTVXTO YSHFLBEKDIi 240 

nSLKEEDRSQ MWQQLXiEAVI RKQEKQTARS SGEDBVIiIiGA RFRPSKFSFIi 300 

KAUATFG&S FLISACPKIil QDIiL&FlNPQ LliSlIiIRFlG NPMAPSWNQF IiVAGUfFLCS 360 

MMQSLIIOHY YHYIFVTGVK FRTGIMGVIY AKALVITSISV KEUkSTtfGElV 420 

. MDIAPFUSLIi WSAPJiQlZXiA lYPlMQjNIiGF SVIiftSVASHV IiXiIPI^HOKVA VXMRAFQVKQ 480 

MKI*KDSR1KI> MSEIIAIGIKV UOiYAMBPSF LKQVEGIRQa ELQLI1R.TAAY I^ETTrTTFTnH 540 

CSPFLirriilT LWYVYVDPN HVLDAEKAFV gVSIiEKlLRL PLNHLPQLIB NLTQAavgUK 600 

RIQQPLSQEE IiDPQeVERKT ISPGYAITIH SGTFTWAQDli PPTLHSLDIQ VPKOAliVAW 6GO 

GPVGGGKSGL VSALIiQEKBK LEGKVHMKOS VAYVFOOftHI QNCTLQBNVIi FGKWHFKKY 720 

QOTl^EACAZiL ADLBMLPGGD QTBIGBSOSIN IfOGQSQBVfi XJURAVX8DK) ZFZiICD^IiSA 780 

VD&HVAKmF DHVIGPEGVL AGBTRVLVTH GISPLFQTDF IIVIADGQIVS EMGFTPAIjLO 840 

RMGSFANFIiC KYAPDEDQGH LEDSHTAI»BG AEDKBKUjIE DTIiSKfflTDLT DMDPVXYWQ 900 

KQFMRQI,SAI> SSIX3EGQC3RP VPKRHLGPSK KVQVTEAKAD GRIiTQBBKAA IGTVELSVFI9 SSO 

DYAKAVGIiCT TIAICI>I>YVG QSAAAIGANV MLSASTCNDW ADSRQ»KT8L RIX3V7AAUSI 1020 

liOGFLAMLAA KRMAAGQIQA AayiffQATiTiH NKIRSFQSFF DTTPSGRIXN CFSiQ^IYVVP 1080 

BVZAPVILMIt UY&FPKAIST bWIKASOSI. FTWILPIiAV IiYTUVQREyA ArSE^ODKiaiE 114Q 

SV&UBvnSK FeETVTGASV IRAYHRSRDF ElIEDTKtfDA H^SCYFYXI ENBMI^SIGVE 1200 

FVGNCWIiFA ALFAV16RS8 LNPGUViGLSV &YSLQVTFAI* NHMIRMMSDI. ESEmTAVSRV 1260 

KEYSXXE^TEA PHWEGSRPP EGHPPKfiEVE FRHYSVRYSP GLDIrVIiRDIiS Z£VH0QB1CVG 1320 

IVGRTGftGKS SMnid^Fftlli EAaXGBIRIB GUlVADIGIiH DLRSQLTIIP QDPIUPSGTL 1380 

12BOII£)FFG&Y SBBDXHHAI>E I.9BIfiTFVSS QPM3UDFQCS BGGBinjSVGa RQXiVCLAKAIi 1440 

UUCBRZLVIjD BKTAAlDliGT DNLIOATIRT OFDXCIVLTZ AHRLNTXMDY TRVLVIiDSlGV 1500 

VABFDSP2UXL lAARGIFYGH ARDAOX.ft 1527 

Seq ID NO: C395 Protein Sequence
Protein Accession #: hp_004€17

1 11 21 31 41 51 

I I I I I I

MRASPQVCEA LZ1FAEALQI6 VCYGIRKLAL SKTPaAXAUI QTOiCKQEiBG LVSAQVQIiCR 60 

SEaUEIMSTW HAAREVMKAC RRAEASMRHH GSSIEIiAEITY LLDLEEIGTRE SAFVYALSAA 120 

AISRAIARAC TSBDLPGGEC GPVPGEFFGP GNRHGGCADN I^SYGI^LMGAK FSDAPKKVICK 180 

TGSQANXLMa LHSfSEVGRQA LSA5L£24iCCK CBKSVBGSCdl R?CKICGLQEI> QDVAADIiKTR 240 

YLSAIKWHR FHSTRKHLVP KBLDIRFVKD SELVYLQSSP DFODCNSKVG SBGTQDIBQCn 300 

KTfi9iG8I>Ea> LHCOGROXNP ITEIl&WBRCH CSOrSMCCYVT CRRGEaTVER YVCSC 354 

Seq ID NO: C396 Protein Sequence
Protein Accession #: NP_114072

1 11 21 31 41 51 

I I I I I I 

MEWGVT*r,KVT SLlAALALljQ RSSGAAAA^ lOSLACQBTTV PIiCKGIGYNY TYMPNQFKHD 60 

TQDEAGIiEVH QPWPIjVEIQC SPDLKFFLCS HYTPICLEDY KKPIiPPCRaV CERAKAGCAP 120 

IWQYGFAHP I]RMRCDRI<P£ QGNPDTIiCMD YHRIDLTTAA P&PPRRZ.PPP PPGBQPPBGS 180 

- OTGRPPGftRP FHRGGGBGOa G(3»AAPPAR OGOGGGKaaP PGGGIUIPCEP GOQCRAFMVS 240 

VBaKBHPLYH HVKrGQIflHC ALPCHHPFPfi QDBRAPTVPW iGLMSVIiCFV STFATVST?FL 300 

IDMERFKYPS RPXXFIkSACY LFVSVGYLVR I»VAGHEXVAC SGGAPGAGQA GGAfS3AAAGA 360 

GAAGAGAGGP QGRGEYEELG AVBQHVRYET TGPALCTWP IiIiVYFFGhlAS BIWRVHiSLT 420 

WFIiAAGMKMG NBAIAGYBQY FELAAUIiVES VXSIAVIAI»9 SVDGDPVAGI CJYVGEIQSLDN 480 

XiRGFVIjAPLV IYI.FZf3THFIi lAGFVSLERI X9VIICQQD6P TKIHKIiEICLH IItI.6I>FTVI.Y 540 

TVPAAWVAC ItFYHQmiSFR WERTHKCTCI* RDLQFOQABa PDXAVFMLICir 7MCLW6IT8 600 

CSVHVWfiGKTL ESHRSIiCTRC CHASKGAAVG GGAOATAftSG GGBPGOGGGG 6PGGGOGPGG 660 

OGGSbYSEVS TGLTWl&GTA SSVSVPEIQHP L&QV 694 

Seq ID NO: C397 Protein Sequence
Protein Accession #: XP_050625

1 11 21 31 41 51 

) I I I I I 

MLQGPGSIxUi LFliASHGTLG SARGLFI.FGQ PDFSYKRBNC KPIPAULQLC HGIKYQNMRl* 60 

PZnjdGRBTHK BVLBQAGM71 PZ>VHKQC3IPD TKKFLCSLFA FVCIJ;DIJ>ET IQPCRSliCVQ 120 

-VKE3ftCftPVMS AF^PHPDMb EGDRFPQDHD XiCIFIAEGDH LIiPATBEAPK VCBACKNIEND 180 

DDHDIMBTLC KNDFALKIKV KBITYDIRDT K1II*BTKSKT lYKLHGVSBR JSLKRSVLNUC 240 

DSLQCTCEEM NDINAPYLVM GQKQBGELVI TSVKRnQKOQ SS^IISRSI RKLQC 2B5 

Seq ID NO: C398 Protein Sequence
Protein Accession #: NP_001297.1

1 11 21 31 41 51 

1 I I I I 1 

MSMGLEIT6T ALAVI.GWLGT IVCXIftl^PHNR VGAFIGSHII TSQSIIKBGLtt MHCWQ8TGQ 60 

MQCRVYDS&I^ ALPQDLQAAR ALIWAXIrLA ATGUiVAZiVG 2VQCTSICVQDD TAEAKITXVA 120 

lU GVLFI^L2\AIjI^ TLVPV5HSAH TIIRDPTEiPV VFEAQKREMQ ASLYVGHAAA ALQjUiGGAZiIi IBO 

CCSCPPREKK YTATKWYSA t>B&TGPGASI. 6TGYDRKDYV 220 

Seq ID NO: C399 Protein Sequence
Protein Accession #: NP_036581.1

1 11 21 31 41 51 

I I I 1 I I 

HESKKDITHQ SEIiHKMKPRR HL^DDYIiEK DTCBTSMUCR PVLUfliHQTA HADHFDCPSS 60 
LQHTQEIiiFfQ WHLPIKIAAI XASLTElinii IiREVlBPIiAT SHQQYFYKIP IIiVZNKVLPM 120 
^0 VSmUMiVY liPGVIAAIVQ LHNGTKYICKP PHWLDKHMLT RKQPGIjLSFP FAVLHAIYSL ISO 
SYPMRRSYRY KLI^AYQCfV (KJI^KEDAHIE HOVWfiMBIYV SUaiV&IATIt AIiIAVTfilPS 240 
VSDSLTHREP HYIQSKLGIV SI^LLGTIHAL IPAWNKWIDI KQPVWYTPPT PMIAVEI.PIV 300 
VLIFKSILFL PCLRXiai.KI SHGWEDVTKZ mCTCXCGQIi 339 

Seq ID NO: C400 Protein Sequence
Protein Accession #: NP_001766.1

1 11 21 31 41 51 

on ' 1 1 I I I 

OKI MANCEF&PVS GDKPCCRXiSR RAQLCLOVSZ IiVIiILWVXA. ' VWPRmZQTW SQPOTTXRFP 60 

ETVLARCTKY TEIKPEMRHV DCQSVWDAFK GAPISKHPCN- ITEEDYQPLM KLGTQTVPCW 120 

KILLnSRIKD IiAHQFTQrVQa UMPTFiRm:i.L GYIjADDLiTVIC GEFHTSKIHX Q9CSDVIRKDC 190 

SSMPVSVFWK TVSRRFAB2^ CDVVHVM£^ ^8KIFDKN6 TFGSVEVHNI* QPSKVQTLSA 240 

HVIBOGEEDS RDLGQDPTIK BLBSII8KRH lOFSCKHTYR PDKFLQCVKff PEDS9CTSEZ 300 

Seq ID NO: C401 Protein Sequence
Protein Accession #: XP_120513.2

1 11 21 31 41 51

40 I I j 1 1 I 

KV8CTFSGPL RSTNENVKKP YALRAFMPRM SSEAAHIiGB9 RTPKPRKaSA TTEAKXFKRF 60 

FSEQ5ESNSR ZiVBELAVIHT YSDDPAPTTS PSSVQPRSF6 VKQGAPRARF GSRTPPAAAE 120 

AS8PEL6X6B AAC3Q5GARAA APRAOABBCQ FORQAAAAAA TAQTHTLPBA RTSADPAOKR 160 

BSHPRSPAPG OBGICSEGPA PRRfiMBEEMQ PAEBGPSVPK lYKQHSPYSV LiCTFPSRRPA 240 

4j LaKRYERFTZ. VBLPHQHLRT PAQPP3PASPA ASS8SSFAAV VRLOAPPRPP HRGFRAHGTI 300 

PFLLPAPGVA GTLIiPPPTSS SPP&FBFRFW HAAAPRGOTfi BTHMKR&QST LPGSDTMVSV 360 

FGLMAQRKKQ BRSIiKQFBWG XljGSWGTHFC GQDNI^EKBGQ "VAVZiIf RSEG KTAPKKSSMI 420 

U3APAQQCSa VIi8LLNCX36K TiTtt?fiWfffiQSrt I&CtfXQB6&& YHERQ£H)CHI GKSVHSQTSD 4B0 

^ NVDXEHQYMQ HHQQTSAFIiR VFTDSLQIiyii ItSGSFPTPNP SSASHYGEIiA DVDPLSTSPV 540 

JU HTLEfflSIiDS TASLCKSRKL &RBPPVK&&F P!EIPLC2Q2^LaS G2VBRPFSGAQ QSIAYRVNfiE 600 

LEDQZR&FVP EiSCBftliEIOIi TSEiGSKQLIiK KTPVYXT5KQ HDBAVHSSKK DeRXIiIiRYI.1 660 

XFVFTTDHLK YSCGLGKRKR SVQBGETGPB BRPU)PVKVT CLRGTASFRS VSPSVISFER 720 

IGOS&PRT&V QPSVF 735 

Seq ID NO: C402 Protein Sequence
Protein Accession #: BAA92562.1

1 11 21 31 41 51 

fid ^ I I I 1 { 

OU METTVLSGX9 FEYKGMTGWB VAGDH2YTAA GASDSIDFHIL TLWPGFSI^ Q3VMADTENK 60 

EVARITFVFB TLCSVNCELY FMVGVHSRTSI TPVETfnOS&K aiCQSYTYIlE ENTTTSFTHA 120 

FQRTTPHEAS HEYTWDVAKI YSmVTNVMM GVASYCRPCA WBASDVGSSC TSCPAGYYJD leo 

SDS6TCESCP PMTILXAHQP YGVQACVPCX3 FCTKHNKIHS UCXHtDCClfSB. STDPTRTElilYH 240 

F&AUNTVTL Aa^PSFTSKQ XJOFHHFTCS LOGKQGRKKS VCXWTTDLR ZPBGB9QFSK 300 

SITAYVGQAV IIPPEVTGYK AGVeSQFVSIt ADHl>lGVTia> HTLDSITSPA ELFHLESLGl 360 

. POVIPFYHSN DVTQSGSSGR STTIRVRCSP QKTVPGSIjLIj PGTCSDC3TC35 QCNFHFLHB3 420 

AftACPLCSVA DYHAIVSfiCV AGIQKTTYVW REFKLCSGGI SLPBQKVTIC KTIDPWLKVG 400 

XGABTCTAIIi UTVXiTCyFHK KHQKItEYKYS KUVMKATLKD CDLPAADSCA XHEQEEfVEDD 540 

XiIFTSKKSLP 6KIX8FTSKQ PAPVTISLSS D9 572 

Seq ID NO: C403 Protein Sequence
Protein Accession #: HP_055iad.l

1 11 21 31 41 51

75 I 1 ] I I I 

MALQGISWE IiSGLAPGRXC AMVIADFGAH VTOVDRPOSR YDVSHLGRGK RSLVLDLKQP 60 

HBPRAAASVQ AVGCIAAGAIiP PRCHSSTPAG PHDSAAGICSK AYliCQAEHm PVQESFCRIA 120 

^HDIinrLAIiS GVItSKlGRSG EITPYAPU9LV ADFAGGGLMC AliGIlMALFD AmCDROgVI 180 

DAHMVEOTAY IiSSFLNJClOK SSLMBAPRGQ NMCTOGKFFY TTYRTASGBF MAVOAIBFOF 240 

oU YBLLIHQLGIIi KSDELPNQIKS iSDHPEtCKKK FADVPAKKTEC iffiWGQIFDGT DACVTSVLTF 300 

HBWHHDBHK jSRGSFITSBB (^SPRIAPIa UUTTPAXPSS BODPFZQEaT BBIIiBBFGFS 360 

KBSIYQiUjtSD KXIBSHKVKA &I. 3B2 

Seq ID NO: C404 Protein Sequence
Protein Accession #: XP_09aa32.i

1 11 21 31 41 51 

.1 I I I I I 

J KQRHTLHAAA FLTLH8AQAF PQTDISISPA LPELPI^PSIiC PLPnHBFKSH CYIlFFFIiNKT 60 

WflEADIiYCSE FSVCRKSAKIi ASIHSMEEHV FVYDLVNSCV PQIPADVWTG LHDHHQEGOP 120 

EWTDGSSYDY SYWDGSQPDD GVHM7PEES0 CVQIHYRPTS BQLQAPBPQIi PLSISEATDV 180 

YLPEOFPABP KLMDQSWVSR KSIjKPSKSHL MBPPTPVAKH QKAKTRHRSL RQVWWPSGKA 240 

GSWKBRMKAD KGRRKRSJVPR QBGRIiROlEiR RDRAASaQOR PSQQRKQRQQ SRQERGWEBIi 300 

lU G6VSPMRGAQ AffQHGLGAGS QRGAAPEO&E £IH1QAP£L6ST WRGQRLQPQT JMjCHWALRK 360 

LPGBIAHGIJA AFVQPALQfVQ EEKNHRTRPS Q&YFTMSDPT CE>QD9KEQSli RSBGRBAEKD 420 

OPYRIiVKKKR 6PVACP8SFS LQSG6EVCLD FPVELAAGTH lARBPP 466 

Seq ID NO: C405 Protein Sequence
Protein Accession #: XP_054869.2

1 11 21 31 41 51 

~ I I I I I I 

METCCPPVTL EQDIiHRKMKS WMLQTIiAFAV TSBVIiSCAET IDYYGEIC3DM ACPCESKEfit 60 

Z\J IjTVSCEEIRGI 3SLSE19PPR FPIYHTiTiDSG mjUIRLYPHE FVHYTGASIIi HLGSNVIQDI 120 

Bn?AFHG3C>R6 TiRRLHLNNNK LELUtDDTFL GLESILEyLQV DXNYISVIBP ISAFGKIiHEiLQ 180 

VLXLHDNLIiS SLPNHDFRFV PIiTHUSLRONT RltKLIjEYVOl. I^EHDKWEL QLBEHPnHCS 240 

CELTSLKDHIi DSISYSALVG DWCBTPFRIj HGRDIiDEVSK QSLCPRRlilS DYEHRPaTPIj 300 

fiTTGYI^XTP AfiVNSVATSS 8AVYKPPI1KP PKGTRQPNKP RVRPTfiftQPS KDL6YSNYGP 360 

Zd SrAYOnCSPV PLECPTACSC NLQISDI^GI^ VHCQERKIES ZKEUQ/PKFYN -pKSKXUTEtSn 420 

rAVVi{Rll>FL SAT6UXUJIL <amRlSMIQD RAFGElLllilLR KLYLNGSRIB XOiSPELFSGL 480 

QSIiQrCiFLQY NlilSElQSGT FOPVPNLQIiI. FXJIHNIiLQAM PSGVPSSLTIa TiSLHIStSOEF 540 

TSLPVfiGVLD QLKSLIQIDL HDNPVIDCTCD IVGMKLHVBQ LKVGVLVDEV ICKAPKKFAE 600 

TUMRSIKSEIt LCPDYSDVW STPTPSSIQV PART3AVTPA VRIiHSTGAPA SLGAGCSGASS S€0 

30 VPLSVIilLSL LLVFIMSVPV AAGLPVLVMK RRKKNQfiDHT STNWSDVSSF HMQYSVTOGG 720 

GOTOGHPEAH VHHROPAUK VETPAOHVYB yjEHFXiOBHC IdrPIYRSSEG N8VEDYKDLH 7B0 

EXtKVTYSSNS HLQQQQQPPP PPQQPQQQPP PQLQKiQPGBE BRSESHHIJRS PAY5VSTIBP B40 

RBDIiLSPVQD ADRPYRGIIiE PDKHCSTTPA GH8LPEXPK7 PCSPAAYTFS PNYDIAREBQ 900 

YIJIPGM3D8R LREPVLYSPP SAVEVEPHOtlT SYIjBIiKAKUT VEPDYLEVLB KQTTF8QF 958 

Seq ID NO: C406 Protein Sequence
Protein Accession #: NP_000784.2

1 11 21 31 41 51

40 1 1 I I I I 

MGII>SVDL1.X TliQILiPVFFS ISCiSFhKLYDS VIIiIjKHWI«L I>SRSKSTRGB WSBMIiTSBGL 60 

RCVHKSFXiU? AYKQVKDGED APNSSWEVS STBQGQUSQR STQEKIAEGa. TCSLUSFASP 120 

SRFI.WHFG8 ATlIPPFTSQIt PAFRKLVEEP SSVADFIiLVy IDBAHP8D6H AIFGDSSLSP IBO 

KVKKHOHQED RCAAAQQI>LE RF5LPPQCRV VADRKDHHAH lAYGVAFHRV CXVQRQKIAy 240 

4^ LGGKSPPSYl^f LQHSrSHHIiBK NPaKRUKJCTR IiAG 273 

Seq ID NO: C407 Protein Sequence
Protein Accession #: NP_00654Q.2

1 11 21 31 41 51

I I I I 1 I 

MQSCVSSQPS SHRAAPQDEE. GGIU3SSSSBS QKPCEKLRGL> SShSTEI^m SPrWTBCBP 60 

GCAVDLGLAR DRPLEADGQB VPLI>SSG&QA RPHLSGRKI.& LQES8QGGLA AGGSLDMZ9GR 120 

CXCPSLPYBP VSSPQSSPRZi FRRPTVESEIH VSITGHQDCV QUaQfYTZjKDB IGKGSYGWK 180 

ItAYNEKIDNTY YAHKVLSKKK LIRQAGFPIOl PPPR6TRPAP GGCIQPR6PI EQVyQSIAIIi <240 

KKLDHPSWK LVEVIiDDPNB DHIjYMVFEt*V KQOFVKEVPT UCPZaSEDQAR FYFQDLHEGI 300 

BYLHYQKIIH RDIKPSNMAT GBD<SCIKIAD F6V&NBP1Q38 DALLGHTVGI FAFKAPBSIiS 360 

filRKIFSGKA ISVWAMGVTXj YCFVFQQCZPF MDERIKCE^ KIKSQAIiBFP DQiPDIAEDLK 420 

^ DXiZTRMLDKl!! PESRIWPBI KLHPHVTRHG ASPXtPSEDBN CTLVSVTEEE VEHSVKHZPS 480 

OU IiATVIDVKIW ISXRSFGSitPF SGSRRSBRSIi SKSfSSfLVTSK PTRECESLSH ItKEARQElRQP 540 

PGHRPAPRGG G6SA&VRGSP CVS5CNAPAP G8PAE2HHPCR PEEAHBPE 568 

Seq ID NO: C408 Protein Sequence
Protein Accession #: »IP_061116.2

1 11 21 31 41 51

I I I ) I i 

MGLGlf KBKG LILCLWGKFC R»FQRHESHA OBSJXB&^JJd QKRIWE^LL LAAKDNDVQA 60 

UaXUiKYEDC KVHQHGAITOK TALHIAALYD NLEAAMVLMB AAPELVFEPM TSKIiYEGQTA 120 

/U laHAWMCfSlM WCVHAIiLARR ASVSARATGT APRRSPCWIal YPGBHPI.SPA ACVNSmiVR IBO 

IiLXEHGADXR AQD8LGNTVL EZXiXIiQPNKX FAOQMKIILLL fiUSRHGDHLQ PLDLVPNHQG 240 

liTEFKLAfiVS GarVMPOHIiM QKRKHTQWTY GPLTSTLYDIi TBIDS9GDBQ SIiIiBLIlTTJIC 300 

KRBARQIIiDQ TPVKELVSLK HKRYQHPtfFC MLiGAlYLLYl ICFTMCCIYR PLICPRTHHRT 360 

SPRDHTUCQQ KLIOEAYMTP KDDIRLVGEL VTVIGAIIIIi LVBVPDJPRM OVTRPFQQTI 420 

/5 IXH5PEHVI.H TYAPMVliVTM VMRLISAStSB WPMSPAIiVL GWCMVmrFAR GFQMLGPFTI 480 

MIQKMIPGDL MHPCKIMAW ILGPAfiAPYI IFQTBDPEEL GHFyPYPMAL FSTFBLFLTZ 540 

IDGPAXIYNVD XiFFMY&lTYA APAIXATUJ^ IjNLLIAMKGD THigBVABBRD ELGIRAQIVAT 600 

TVMLBRKLPR CbWPRSQIOG REYGU3DRNF LRVEDRQDItir RQRIQRYAQA FIflROSBDU) 660 

KDSVEEOtELQ CPFSBHLfiliP KP9VSRSTSR SSAiniESIiRO GTDRRDiLRGI XMafiLBDGES 720 

OU ^EYQI 725 

Seq ID NO: C409 Protein Sequence
Protein Accession #: »P_06e7lo.x

1 11 21 31 41 51

t I I I I I 

MQICVTLGLIiV FIjAGFPVLDA MDI.EDKHSPF YYTHHSLQVa GlalCAOVIaCA KGIZIVHSEH 60 

RSSGEQAGRG WGSPPLTTQL SPTOAKCSOC F6QK6GHHPG BTPPLITPGS AQ9 113 

Seq ID NO: C410 Protein Sequence
Protein Accession #: NP_005962.1

1 11 21 31 41 51

10 1 I I ] I I 

MQKV^irLGliLV PIAGFFVIiDA. NDLBDKHSPF YXDHKSIjQVG QZircnBVZ.CA M6IIIVM3AK «0 
CKCKFGOKSG BHPGETPPI>X TPQSAQS S7 



Seq ID NO: C411 Protein Sequence
Protein Accession #: NP_004952.1

1 11 21 31 41 51 

I I 1 I I 1 

HIiSKVLFVIili GlLXilLOfiRV BGPQTESKHS ASSKDWYGP QPQPLBHaiili SEBTlCfiTETB 60 

rEGSRVaXXiPS ASRIIiNTILS SnTDHK^ItPGI GBKPTWTVE lAVNfiLGPLS IlfDMEYTIDI 120 

IFSQTHYDER IiCYNDTFESIi VliHGNWSQL HIFDTFFRNS KRTHEKBTTH FHQMVUIYKD 180 

GRViYTIRMT IDAGCSLHMI* RFPMDSHSO* TiSPgSFSYPB NH^IYKSTBUF KLEINEKNSW 240 

KI.FQFDFTQV ENKTBIITTP VCasFHVHTIF FNVSRRFGYV AFQHYVPSSV TTFMIjSNV&FH 300 

IKIE8APART SLGITSVUTK TTIiOTF9Riai FPRVSYIXiUi DFYIAICFVF CFCAUflFAV 360 

IMFtiXYKQFX. ABASFKUZHP RlNSRAKAST SARSRACKRQ BQEAFVCQlV IXBG8DGEBR 420 

P8C8AQQPP8 FGSPBGPRSli CSKIACCEHC XRPKICTFCMV PDCBG8THQQ GREiCZHVZEXi 4&0 

BHYSKWEPV TrFFPFNVI.yK LVCLKb 500 



Seq ID NO: C412 Protein Sequence
Protein Accession #: HP_DeBBl9.1

1 11 21 31 41 51
11)1 I I 

VSEYTXDIIFS QTmrDBKLCY NDTFESLVUT fiHWSQliVlP STFFREISKRT HBHBITHPHO 60 

Hnmi-XKE>GKV IiYTlRMTlEA GCSLSlMLItFP MDSHSCPIiSF SSFSYPEETEM lYKHENFKLE 120 

INBKNBWKLF QLDFTGVSNK TBIITTFVOD FMVMTIFfHV SRBFGYVAFQ KyVPSSVTTM IGO 

Z>^WVSFWIKT fiSAPAHTfiLG Il^eVLTKTTL GTFSRKKFPH V9Y1TAIJ>F? lAICFVFCFC 240 

ALLBFAVLNP LIYHQTKAHA SPKIillHPRIlil SSAHARTSAR SRACARQIXQE AFVOUVTTB 300 

GSOtaBBBPSC 9AQQPF&PGS PEGPRSLCSK LACCSHCKaF lOnfFCSCVPDC &GSXHQQARI* 360 

GXRVYR£J>NY 8RWFPVTFP FFSIVI^yVlUVC 393 

Seq ID NO: C413 Protein Sequence
Protein Accession #: NP_068822.1

1 11 21 31 41 51

1 1 I I I I 

MEYTIDIIFS CrWKSKRTHK HEITMPNQHV RIYKDQKVIjY TIRMTinAGC SliHHLUPPHD 60 

SBSCFLSFaS FSYPEHEHIY KH&HFXL&IN EKNSWKLFQF DFT0VSHSTB IITTPVSDFM 12 D 

VHTITFNVSR RFGYVAFaiSY VPSSVTlHbS WVSFHZKTS& APARTSESXT SVIjTHTTLOT ISO 

5U F&RXNFPB:V& YITAXJ^FYIA XCFVFCFCAL LEEAVZ^IiZ YMQTXAHA&P XIABPRIIISR 240 

AHKRTRARSa ACAROKQBAF VCQTVTTEGS DGEERPSCGA QQPPSPGSPH QPRSLCSKLA 300 

OCBWGKRFKK YFCKVPDCBQ &THQQGRLCI HVYRLDHYSR WFFVTFFFF HVLYNCVCZiir 360 

I> 361 

Seq ID NO: C414 Protein Sequence
Protein Accession #: HP^OSS 030.1

1 11 21 31 41 51 

OU MSYTIDXXFS QTHYDSRLCy NDTFBS&VIUI GHWSQLWIP DTFFSSISKRT HSHEITHENQ 60 

KVRXYKDGKV 1.YTIBMTXDA GCSLHHLSFP MDSHSGPI^F SSFSYPEHEK XYSSISXtSVO^S 120 

INEKMSWKLF QPDPTGVaNK TKIITTPVtSD FWUMTJFEHV SHUFGYVAFQ KYVPSSVTTM IBO 

£>SHVSFW1KT ESAPARTSliG IT8VLTMTTI. GTFSRKNFPR VSYXTAEiDFY lAICFVFCFC 240 

AM.SPAVUIP bXYNQTEABA SPKUIHFRZH SRAHARTRAR GOACARQHQH AFVODZVITE 300 

(SSDaBE&P&C SAQQPP8P6S PS6PRSLCSK lACGEMCKRF KKYFCSnrPDC BS&TtlQQQRIa 360 

dHVTOT.DWY SRWPPVTPF PENVLYWIrVC 1Mb 393 



Seq ID NO: C415 Protein Sequence
Protein Accession #: NP_068591.1



1 11 21 31 41 51 

I I I 1 I I 

MPAVS6FGPI. FCTiTiTiTiTiTinF HSFBTGCPPCt RRFSYKIfSFiC GPRLAIiPGAG IPPHSHHQDA 60 

ZLQLEEVSZ.T PSMRMR&QAV 7SRASVPFBA VIEVBVQHRVT GU3SIU3AI1GM AVMYTRlSRGH 120 

75 VGSVLGGLAS VDGiaiFFDS PASDTQDSPA ZRVLASDOHI FSBQPGDGAS QQIXSSCBHDF 180 

RNRPHPFRAR ITYWGQRLIM SLNTSGUTPSD PGEFCVDV6P liLXiVPOGPFG V&AAT6TLAD 240 

X9HDVLSFIiTF SLSKPSPEIVP FQPFI^MQQI. SIiARQLBQIjH ARLGLGTRBD VTPKSDSEAQ 3 DO 

GEQBRIiFDIiS £II£RHRRIX* QAIiRGLSKQIj AQABRQWKIQQ I^PPGQARSD GGKALDASCQ 360 

ZPSTPGRaOH IjSMSIMRDSA XVGACiLHGQW TIiLQAZiQEHB DAAVRMAAEA QVSYZiFVBZE 420 

oU HHFIiBLDaZD GLLQKELRGP AKAAAEAPRP PGQPPRASSC LQPBZFXiFYIi XilQOTVGFFGY 480 

VHFRQBUnCS I^ECLSTOSXi PU3PAPHTPR AIiSXIiElRQPli PAGKPA 526 



Seq ID NO: C416 Protein Sequence
Protein Accession #: XP_117036.1
MERRTR0AU? SRRPPPPl^A ljRHI>CrrGLQA AGMAifPOTLW RHTCQGRAXA ABGPHGliFRP £0
J HRCPREAGQA FVOPSPETQCJ VAHVCSRARV SVDEBffiPGQG AYAMavrPAH KGCBBHSGRT 120 
VRGSVSVKRP BQAAPETGRG PAVARGS6D6 NBGGW? 156 

Seq ID NO: C417 Protein Sequence
Protein Accession #: XP_167803.2

1 11 21 31 41 51
1 I I 1 I ] 

MPGKGQRECTA TZfifKPGGliPGA PGVGIG6HCD YVLECKCPllC WKIKTHHHKK KNTFAAKRNEB 60 
KLKKICKKQEK KNHTKFFHHT YPLSQQDPLP AKSIfFOGNGP CPIiHQGLF 108 

Seq ID NO: C418 Protein Sequence
Protein Accession #: NP_079056.1

1 11 21 31 41 51

20 1 I I ] I I 

HPSOjVBRYSM PRHEVyVIjI>I RNIFUCISII QILCYYHLNT VAIiSQBBCHB TlilGQDIYKIi CO 

IrUfDFVFSIiV HSFI4GBFLRR IIGMQLITSI* GLQEFDIARH VlrELlYAQTL VWIGIFFCPL 120 

I>P3PIQI4IMIjP iMFYfiKHlSL MMNFQPPSKA WR&SQHMTFF IFLLFFPSFT <3VLCTLA1T1 IBO 

WRUCPSADCG PPRGIiPLFIH SIYSWIDTLS TRPGYItWVW lYIOaLIGSVH FPFILrFIiXVI. 240 

ZD IITYLYWQIT EGRKIHUSU/ UEQlZNSGKD KHPIilBKIiZK LQDHEKKfiHP 88XiVLER2lBV 3DD 

SQQQFtiHI^E HDQ91J>M?SR RSVQECaTPRA 330 

Seq ID NO: C419 Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51 

i I I I I I 

MLSDreVWBI IIQVBKVSSG VQSHPefiNQl FQEKWLliDSS iKMVLSlfiDt DVIDSQTVBK 60 

11HDQKC3NQVI. RF3TSLNE3M SQTLH3I.KCM GIDTPGSSHE TVOGOKLIAS UCPMrSRWRI 120 

KAIKNQ£>R!CM EEKHNIiSKXV D^^&XfiKQTHR IliC^dlCClQC! USSlfiRAYlCR SKN&LS&IUI 180 

SISIiWaKTLK IIGGKFGTSV IiSYFNFIiRWIj LKENIFSPIL NFSPIIIPQF TVRKKKTIiQF 240 

TGLEFFTGVG YPRDTVHYYQ FYTNSTlCHG NSOASYKMQL AYIPTIGACL TTCFFSLLFS 300 

MAKjrPRmjFI NPHIYSGGIT KTrlFCWDFTV THEKftVKLKQ KNLSTBIREN LSEUiQBNSK 360 

LTFttQLLTRF GAYMVAWVS TQVAIACCAA VYYIABmU! FliEETHSHFGA VIJUU?FWSC 420 

*IU INLAVPCIYS MFRI,VERYBM PRHEVYVLLI RHIFUCISII GILCTYWUIT VftLSOBBCSIE 480 

IIiIOQplYBIi LLHDFVF2LV KSFLGEPE»RIL IZGMQLIT3Z. QLQBFDIASH VI.ELIYAOTI1 540 

VWXGXFFCFL MFIQMIMIiF IMPYSKNIfiL MMNPQPPSKA V?RASQMHTPF IFLLPFPSFT 600 

GVLGnAXTI HRIiKPSADGQ PFR<3IiPIi7rH SXYSniDTIiS TRPGYIiHWIf lYRNIilGSVH 660 

. F7FILT£.IVI» IITYI>YNQIT EGRKIMIRIiL HEQ1IHB6KD XMFLISKLIK LQDMEKKAigP 720 

4D SSIATIjBRKBV BQQGFLHLGE EDGSLDLRSR RSVQBGEIPRA 760 

Seq ID NO: C420 Protein Sequence
Protein Accession #: NP_002241.1

1 11 21 31 41 51

I 1 I I I I 

WaaPLVI ^aLS ALRRRKRIiIS QEKSIAGRALi VIiAGrGtOLH VliltAEMIjWFG GCSWALYLFL 60 

VKCTiaiSTF ZfULIZLXVAFH AKBVQriFMTD MGLIIDWRVAJ* TGRQAAQTVIi BIjWCGIiRPA iZO 

_ _ PVBGPPCVQD LGAPLTSPQP HPGFLGQGEA LLSLftWtiLRli YLVPRAVUA fiGVI^UtASYR ISO 

DD SIGMJ90V31F RHHFVAKLYM NTHPGRLTiLG IjTUJIiWLTTA WVIiSVAESQA VHATGHZiSDr 240 

XiWl^IPITFIiT IGYGDWPGT KWGKIVCLCr GVMGVCCTAL LVAWARKLB FKKAEKHVHM 300 

FMMDIQYTKE MKESAAaVIiQ EAWMFYKHTR RKESHRRRRH ORKIOiRAINA FRQVRIiRHRK 360 

I1RBQVN8KVD ISKMHMILYD LQQMLSS6HH ALEKQlDTIiA GXLDAL7HLL STALGPSQLP 420 

EPeQQSK 427 

Seq ID NO: C421 Protein Sequence
Protein Accession #: NP_079533.1

1 11 21 31 41 51

, MGGHQSDEDD EAYGBPVKYD PSFRGPIKNR BCTDVICCVIj FLLPILGYIV VGIVAWLTGD 60 

PRQVI1YPBH8 OrGAYGGMOEN iCDKPYIiYFBF IFSCIL9SMI ISVAENGHiQC PTPQVCVSSC 120 

&EDPttTVGKM" EFSQTVGBVF YTKNRNPCIjP GVPWMMTVIT SLQQEXjCPGF UCiPSAPALGR 180 

CFPtmilTPP AZaPOlTKTDTT IQQGlSaijID SLHftSDX^VK ZFEDFAQSfnr HZIiVAXiSVAZi 240 

/U VliSLIiFIIiLL SLVAGPLVZiV UlUSVUOVIA. YGlYYOfBEY KV£iIEDKS2£Z fiQLGFTIllLfi 30O 

AYQSVQBTHZi AALIVrAVLB AZLLLVLIFIt RQRIRIAIAL L1CBASKAVGQ HKSTHFYPLV 360 

TFVLLIiICIA YffAMTAIsYPI* PTQPATLQYV IiHASNISSFG CEKVPINTSG NFTAHItVNGS 420 

C3EKSLHCVPQG YSSKGLIQRS VEHLQIYGVI, GLFMTLHWVIi ALGQCVIiAGA FASPYWAFHK 480 

PC3P1PTFPLI SAFIRTLRYH TGSIAFGAIiI LmUVQlAKVl liBYlDHKLRQ VOHFVAACZM 540 

/D CCFKOCXiWCIf EKFIKPLNRN AYIMIAIYGK NFCVSAXfilAF MIiLMRHrVRV WEiDKVTDLIi 600 

IiPFGXIilAVVG GVGVZiSFFFF SGRTPGI/SRD FKSPHUTYYH XiPIHTSZIiGA YVIASGFFSV 660 

FGMCVDTZiFIt CFTiRnT,ERMK GSLDRPYYMS KSLI1KIL6KK HBAPPDNKKR KK 712 

Seq ID NO: C422 Protein Sequence
Protein Accession #: ]|P_057364.1

1 11 21 31 41 51

\ \ \ \ \ \ 

MGSU8GQA6R HlYKSIiADDG PFOSVEPPKR PTSRLIHHSM AMFGREFCYA VBAAYVTPVIi 60 
LSVGLPSSLY SIVWFI>SPIIj GFLLQFWGS ASDECRSRWG RRKPYILTLG VMMIjVGMT^Y 12 0

UiTGATWAAL IflNPKRKLVW AISVTMXGW LFDFAADFID GPIKAYUDV CSHQDKEKQL 180 

HYHALFl^FG GALGYU^GAI DWAHIiELGKL LGTEFQVMFP PSALVIjTLCF TVHLC5IGE& 240 

PIiTEVAKGTP PQQTPQDPPI> SSDatfYEYGS ISKVKHGYVK PELAMQGAKH KNHKBQ7KRA 300 

MTLKSIiHAIj VNMPPHYHYIi CISHLIOHTA FLSNMbPFTD FlflSQXVYIIiGD FYSaHNS^F 360 

IiTYERGVEVG CWGFCIMSVF SSLYSYFQKV I.VSYIGLRGI» TCFTGYl^LPGL QTGFIGLFPN 420 

VYSTLVliCSL PGVMSSTLYT VPFKLITEVH REEEKERQQA PGGDPnMSVR GKOHDO^TIiT 480 

CMVQLAQILV GGGIiGPLVNT AGTVWWIT ASAVftlilQCC FUM.FVRYVD S30 

Seq ID NO: C423 Protein Sequence
Protein Accession #: HPj0oa2£4.1

1 11 21 31 41 51 

1 I 1 t 1 I 

MBGFGGVOGR GTAGFAAKOV HKGEU^EEGPV LGAAERGFKV STQSRfiRVFB GPGGGCSIiRnT 6D 

PGRGTGRQKG AW6PKABDGV RRKTXiGMPaG SRKOVItAPOG PAGSHGARGG RRKDGP&RTIH 12 0 

RGGATAAARH HVPPAPGOPF QPRAFA6SIR VPARAQGAVB PTSAAAVA^Oj ARPAGGAJ:>PT 180 

AGAQOAGPAR GaSGBG&EWA RRGKGRPGPY QfiPIiGPAVAE GQEIiKDKSSD RYPIMGFQAI, 240 

VLTALIiVGLG HSAGIiPrX3Alj PEPfiLLPLAFV ATLTAFIFSI. FLYMKfiQVAP VSAIiAPGGNS 300 

GNPXYl^FFIiG REUVFRICFF DFKYFCELRP GIiIQWVZiINZa AMiMKEABXJR GSPSLAMSnjV 3 SO 

HISFQiU.YVGD AXAHEEAVUT THDI^IBDaFS FMIiAFGI3MMI VPPTYSLOAQ PT.TfHHPQPLq 420 

LPHASVZCU NATGYYIEKG A3I8QXNTFRK HPSDPRVAGIi ETISTAT6RK liliVSGWHGMV 480 

RHPHVLGDLI MAIANSLP03 VSHZiEiPYFYL LYFTALDVHR SUIDESSaCa STAHP6R9TA 540 

GVCCTASCPT fiTBAAPPFQV GHVPTHPPA& PapQASTHLG LKGLHPTQP 569 

Seq ID NO: C424 Protein Sequence
Protein Accession #: IVF_056S35.1

1 11 21 31 41 51 

I 1 t 1 1 I 

MGRLIJtAKRIi PPLIjSPUU^ IjVGG&FIjGAC VAlQSDEPaPB CaiTSTSttljDj:* IiIiPTOI>EPIil3 60 

SBEPSEIHGIL GMSLSPiPGSG FPSBE37BBSR XLQPPQYFNE EEEEira>SSL DLGPTADYVF 120 

PDLTBKAG8I EDTSQAQELP ISLPSFIiPKMST LVBPPWHMPP REEEEEBSBB EEREKEEVEK 180 

QEESBESELI. FVSlGSQEEAK VQfVSJyF3W3 SSCTTPGATSCS RHEDSCa^O^^S fiGVEVESSMG 240 

PSIaLIiPSVTP TTVTPGDQDS TBQEAEATVL PAAGI.GVEFS APQSASEEftT AGAAGLSCSOH 300 

EEVPALPSFP QTllnPSGASH PDEDPLGSRT SflS&PLI^PGD MELTPSSATL GQHDLMQQLL 360 

BOQ2^AEftQSR IFIVDSTQVli; KDH9NUVS8fir yXXLNKTE»I DCEVFBQERG PQIiXiAIiVHEIV 420 

&PRIIQS6HHQ AHHI6L8KP8 EKBQHIJ^MTI^ VQBQGWPTQ DVI>SMCiGDZR RSliEElGIQN 480 

YSTTSSOQAR ABQfVRSDYGr l^FWIiWlGA ICIIIIAIX3;Ij DYNCtfQRRIiP KIiKHVSHGEE 540 

XiRFVEKIOCHD HPTIiDVA&DS QS&HQBKHPS I^NGOGALtlGP GSHGAIiMOGK RDPEDfiirVFE 600 

EDTHIj £05 

Seq ID NO: C425 Protein Sequence
Protein Accession #: NP_001188.1

1 11 21 31 41 51 

i I I I I I 

KSEVKPLSRD XliKBTLI^YBQ lOiBPPXMEVL GMIDSEEDTJ) PHSDFD6I1EC MBGSDALAIiR 60 

IACIGDEMDV SLRAPRLAQL SEVAHUSLGIi AFIYCQTEDI RDVIiRSVHDO FnldCBailMR 120 

FHRSP£IPS8W VSCEQVIiIiAIt IJ.IiIiA£i£iIjP£* LSGGUILZiLK 160 

Seq ID NO: C426 Protein Sequence
Protein Accession #: AAF76235.1

1 11 21 31 41 51 

I 1 I i I I 

KATFLPPPfiP WSjB3JURLLL BGLVIiGAAIiR GIUUU3HFDVA ACPGSUlCAXi KRRARCPPGA 60 

IIAOGPCDQPF QEDOQSLCVP RMREPP0G6R PQPRLEDBXD FLAQBLAKKE 80Q9TPPZ>PK 120 

DRORliPf^Ar L6FSARGQGI1 Ei:.GriPSTF<7r FTFTPHTSLG SPVSSDPVEM SFLKPRGGQG 180 

DGLAXiVXjXIA FCVABAAAXjS VASLCHCRLQ RBXRIiTQKAD 1CATAKAPG5P AAPRISPGDQ 240 

RLAQ8AEMVH YQHQROOKLC X^BREXEPPKE U3IAS6DBEN EDQDFTVyEC P6Z1APTGEHE 300 

VRSPIiFDHKA Z1&APLPAFG8 PPiOiP 32S 

Seq ID NO: C427 Protein Sequence
Protein Accession #: 1IP_004436.1

1 11 21 31 41 51 

I [ I I I I 

HVCSLWVLLIi VSSVLALEEV l.UJTTQBrSE IGWLTYPPGG W>EVgVIinDQ RRLTRTPBAC 60 

HVAQAPPGTG QDNWIiQTHFV ERflGAQ&AHI RLHFSVRACS SLOVSGGrcR ETFTIiYYRaA 120 

EBPDSPDSV& SnUUCRMTKV DTIMQBSFP SSSSSfiSSSS SAAWAVOPHO AlSQ&AGLQLSr 180 

VKERSFCTLT QRGFYVAFQD TGAC!LAI«VAV KLFSYTCPAV IiRfiFASFPST QASGAGGASIj 240 

VAAVGTCVAH ABPBEDGVGG QAGGSPFRLH CSnasGKHMVA VGGCRODPGY QPAR6DKAGQ 300 

ACPRQLYK8S AGNAPC8PCP ARSHAPNPAA PVCPCX*BOFY RASSDPPBAP CTGPPSAPQB 360 

LWPEVQGSAIi HZJIHRLPREL G6RGDLLFHV VCKECBGKQB PA8GGGGTCE HCRDSVHFDP 420 

KORGX/rBSfiV LVGGLRAHVP YXIjEVQAVNG V8BX18PDPPQ AAAIHVfiTSH EVFSAVFWH 480 

QVSBA8NSIT V9HPQPDQTN GHXI^DYQLRY YDQAEDE&HS FTLTS6TVTA TVTQLSPGHX 540 

YGFOVRARTA AGHGPYOGKV YFQ!TI*PQGED fiSQItPERIiSIi VlOSlIiGAI*A FLUUAAITVI. 600 

AWFQRKEHQ TOYTEQLQQY SSPGLGVKYY IDPSTYEDPC QAXRBUVREV DPAYIKXEEV 660 

IGraSPGBVR QGRI.QPRGRR EQTVAIQALW AOGAESI^QMT FIjGRAAVLGQ FOHPNlUUiE 720 

C3VVTKSRPLH VUTEFMBLGP U>SFLRQRBa QFSSLOLVAM QRGVAAaMQfY LSSPAPVHRg 780 

IjSAHSrVLVHfi HLVCKVARXiG HSPQQPSCLI. RWAAPEVXAH QKHTTS£DVW SFGIE^MWEVM 840 

SYOERPYliW SBQEVIiKklE QBFRXfPPPG CPPGIjBIiLML DTWQKDRARR PHFDQLVA2)kF 300 

DXHlRKPDlIa QAGGnPQBRP SQJUiUTFVAl* DFPCLDSPQA. TOtGAlGLBCY QDNFSKFOLC »60 
TFSDVAQLSli EUl^PALGITL AGHQKKLLHH IQX^LQQHIiRQ QGSVEV IQDfi

Seq ID NO: C428 Protein Sequence 
Protein Accession #: XP_043340.2

1 11 21 31 41 51 

I 1 I i I 1 

MPFDFRRPDI YRKVPKDI.TO PTYTOAIISI CCCiiPILPLF I,SEDTGFITT EWNELYVEO 60 

^ ^ POKDSCSSKID VSUTI&UWli UCSLV6I/DIQ DEM^SKEVGH IDNSMKlPXiN N6AGCRFBGQ IZD 

XU F9INKVPGNF KVSTHSATAQ PQNFE3iMTHVI HEOiSFBDXliQ VQNZaGAFNA liQGADRIiTSlil 180 

PIiASEDYILK IVETVYEDK5 GK&BLYSHQrr VANKEYVAYS HT6RZZPAIH FRYDIiSPITV 240 

KTTBRRQPLY &FITTICAIX GGTFTVAGJIi DSCZFTASEA. WKKIQLGKMH 250 

Seq ID NO: C429 Protein Sequence
Protein Accession #: NP_002142.1

1 11 21 31 41 51

I I T I 1 I 

^ HAQKBQORTV PCC9RPKVAA MPACKCLLLLT AIGAASWAIV AVLLRSDQBP LYPVQVSSAD 60 

ZiJ ARIiHVFDKTE GflWRULJCSSR SMTARVAGLGC EBMGFL1UVLT HSBLDVRTAQ AMGTSGFFCV 120 

DBGRLPHTQR liLEVlSVCDC PHGRFIAAIC QDCGRRKLPV DRIVGGRDTS LGRWPWQVai, 180 

RYDQABIiCOG SLLSCSDWVIiT AMCFPERISR VliSRWRVPAG AVAQASPHGli QLGVQAWYK 240 

GGYIaPPRSFN SEBMSHDIAn VBLSSPIiPliT BYIQPVdrPA AGQAXaVPGKX CTVTGHGnrQ 300 

YYGQOAGVLQ EARVSIISND VCNGADFXGN QIKPKHFCAG YPEGGZDRCQ GDSGGPFVCE 3 60 

DSISRTPRWR I^CGrVSHOTG CAIAQKPavY TKVSDFREHI FQAXRTEISEA SOHVTQL 417 

Seq ID NO: C430 Protein Sequence
Protein Accession #: BAA92562.1

1 11 21 31 41 51

I I I 1 I I 

METTVLSGIN FEYKGMTGWK VAGDHIYTAA GASONDFHIL HjWPGFRPP QSVMADTENK 60 

BVARITFVFE TI*CSVNCB1.Y PHVOVNERTBI TPVETMKGSK GKQSYTYIIB EHTTTfiSTWA 120 

- - SQRTTFHBftS KKSmiDVAKI YSINVTNVMN GVASYCBPCA IiEASDVGSGC TSCPAGYYID IBO 

3D RDSGTCHSCP PNTIUCftHQP YQVOACVPCG PGnSUKKIHS LCYNDCTFSR KTPTRTFNYN 240 

FSALAHTVlli AGGPSPTSKG LKYFEHPTItS IiCX3NQaRKM3 VCTDNVTOLR IPEGESGFSK 3 00 

8ITAYVCQAV IIPPBVTOYK AOVSSOFVfil. ADRLI6VTTD PtTIDGITSPA ELPHEjESLGI 360 

PDVZFFXSSN DVTQSCSSGR filTIRVRCSP QmrvPGSIrLIi PGTCBDGTCD GCNFHFU^S 420 

AAACFbCSVA DYHAIVSSCT AQIQXXTyVH RBPKLCSG6I SLPBQmmC KTIDFEOiecVG 480 

4U ISAlCTICrAlIt LTVl<TClCFffK KNQKLEYKYS }alV^fI19ATItra) CElIiFAADSCA IMEGEDVEDD 540 

LIFTSKKSLF GKIKSFTSKQ PAPVTiaiiSE DS S72 

Seq ID NO: C431 Protein Sequence
Protein Accession #: NP 0O4ASS.1

45 

1 11 21 31 41 51 

I I I I 1 I 

KPQQBLRTVN GSQMLIjVIjLV ItSWLPHGGAIj SU^EASRASF PGPSEErHSED SRFSBIiRKRY 60 

^ BDIiUTRlAAlt Q&HBDGNriDL VPAPAVRILT PEVRIiaSOOH UHSlSRAAIi PBGLPBASRZi 120 

DU HRALFRXiSPT ASRSnPVTRP IiEU2QI>SIiARP QthStMjEOXLB PPF8QSDQI>I> AESSSARPQL IBO 

BtiRIiIlPQAAR GRSRARASNG DDCPI.GPGRC CRIjIirVRASEi BDLGHAlJffVL SPREVQVTMC 240 

IGACPSQFRA A19MB&QI1CIS imiiKPDTBP APCCVPASXET PHVI<IQ1CTDT QVSIiQTyDDL 300 

LAKDCECX 308 

Seq ID NO: C433 Protein Sequence
Protein Accession #: iap_4430dD,l

1 11 21 31 41 51 

' ^ I i ^ I 

OU HBDPSGARBP SARPBBRDPa SRPHPDQ6RT HDRFRDRPGO PRR3CR99Diaff RRBQC3DSDPK 60 

RDQBBDGNRD Rl^RDRERERS RSRDPDRGPR SOTHRDAGFR A6EHGV?IEKP RQSRTRDOAR 120 

ai/I^Q2AAAPP GPAPWBAPEP PQPQRKGDFG RRRPESEPPS ERYItPSTFRP GRESVEYYQS 180 

EABSZaZaECHK CKVIiCTGBAC CQMDEVIiIJaij I.II.AC&&V5Y 8STGGYT6IT SXiGGXYYYQF 240 

^ GQAYSGOroOA DGaCAQQ|Z23V QFVQZiKLFMV TVAHACSGKb TAliCKZUPVAH GVUIVFHHCP 300 

OD tiUiVTBGLia) MCtlAGGYIPA ZiYFyFHYZjSA AIEO&PVCXSR QALYQSKGTS OFQCSEBQAD 360 

iGAaiFAALG IWFALGAVI. AIKGYRKVRK IiKERPABHFE F 401 

Seq ID NO: C43S Protein Sequence
Protein Accession #: Eos sequence

1 11 21 31 41 51

1 1 i I 1 { 

MGAAGRQDFX# FKAMLTI&Hlj TLTCFPGATS TVAAGCH3Q9 PHK}PHHPGEI DQDHHVHIGQ 60 

^ OSTLLIiTSSA TVySZfiZSEG ORXiVZiamDE SIVXATRHIIa IDNGSBIiEftG SAI.CPFQGBF 120 

/O TULaPSRADB GIQFDPYYGI. KYIGVGBClGGA CBLaBQKKLS HTFLHKKIJIP GGMAEGGYFF 180 

ERSWGKRGVI VHVIDPKSGT VIHSDHFDTY R9KKB&ERLV QYUSAVPDGR XW3VAWDEQ 340 

SRNXiDDHARK AMTKLG&KHF LEI^PRHPHS FLTVKGHP9S SVBDHXEZBQ HR6SAAAR7F 300 

KI^FQlXBHGBY PNVSZ«5SEHV QDVBWTEWFD HDXVfiQ!XKOG EKI&DLNKAH FQKICNRPID 360 

„^ lOATTODOVN Z^BWYKKG QDYRFACYDR 6RACR3XSVR PIiCOKFVSPK MrVl ' IDIg V H 420 

OU ETTtUillrEDNV QSWKPGimiV lA^TDYSMYQ AEBFQVLPCR SCSWHOVXVA QKPHYUIIOB 480 

EIDGVDMRAE VGLIiSHHIIV MGEMKD3CCYP YRKHIC3»PFD FDTFGGHIKF ALGPKAAHErB 540 

GrSLlCHKGQQ LVGOYPIHFH UUSDVDEBiGG YDPFTYIRDL eiBHTFSRCV TVKGS23GZ^r 600 

KDWGYNSL6 HCFFTEDGPB BRMTPDECLG LLVKSOTLLP 9DRD9KMCKM ITKDSYPGYI 660 

PKPZiaDCNAV 9TFnHRNPHK HIjIlKSkAAGS EBTSFtfPXFH KVPTGPSVGH YSPOYSBHIP 720 
liGKPYNNRAH SNYRAGKIID NGVKTTEASA KDKRPFLSII SARYSPHQDA DPUCPREPAX 160
IRHFIAYKNQ DHGAWLR6GD VWlO^SCHFltG BAQEGFLLTG MKAGGILLGG DEAASGMAQG 840 
FSPFCRCLIfK LVTTG9PFAH VSLAHS U66 
It is understood that the examples described above in no way serve to limit the true
scope of this invention, but rather are presented for illustrative purposes. All publications,
sequences of accession numbers, and patent applications cited in this specification are herein
incorporated by reference as if each individual publication, accession number, or patent
application were specifically and individually indicated to be incorporated by reference.
WHAT IS CLAIMED IS:



1. A method for determining the presence or absence of a pathological cell in a
patient, said method comprising detecting a nucleic acid comprising a sequence at least 80%
identical to a sequence as described in Tables 2A-80 in a biological sample from said patient,
thereby determining the presence or absence of said pathological cell.

2. The method of Claim 1, wherein:
   a) said pathology is described in Table 1, including a cancer; and/or
   b) said biological sample comprises isolated nucleic acids.

3. The method of Claim 1, wherein said biological sample is tissue from an organ
which is affected by said pathology of Table 1, including a cancer.

4. The method of Claim 2, wherein said nucleic acids are mRNA.

5. The method of Claim 2:
   a) further comprising a step of amplifying nucleic acids before said step of detecting
      said nucleic acid; or
   b) where said detecting is of a protein encoded by said nucleic acid.

6. The method of Claim 1, wherein said nucleic acid comprises a sequence as
described in Tables 2A-80.

7. The method of Claim 2, wherein:
   a) said detecting step is carried out by:
      i) using a labeled nucleic acid probe;
      ii) utilizing a biochip comprising a sequence at least 80% identical to a sequence
          as described in Tables 2A-80; or
      iii) detecting a polypeptide encoded by said nucleic acid; or
   b) said patient is:
      i) undergoing a therapeutic regimen to treat said pathology of Table 1; or
      ii) is suspected of having said pathology or cancer.

8. An isolated nucleic acid molecule comprising a sequence as described in
Tables 2A-80.

9. The nucleic acid molecule of Claim 8, which is labeled.

10. An expression vector comprising the nucleic acid of Claim 8.

11. A host cell comprising the expression vector of Claim 10.



12. An isolated polypeptide which is encoded by an isolated nucleic acid molecule
comprising a sequence as described in Tables 2A-80.

13. An antibody that specifically binds a polypeptide of Claim 12.

14. The antibody of Claim 13:
   a) conjugated to an effector component;
   b) conjugated to a detectable label, including a fluorescent label, a radioisotope, or a
      cytotoxic chemical;
   c) which is an antibody fragment; or
   d) which is a humanized antibody.

15. A method for specifically targeting a compound to a pathological cell in a
patient, said method comprising administering to said patient an antibody of Claim 13,
thereby providing said targeting.

16. A method for determining the presence or absence of a pathological cell in a
patient, said method comprising contacting a biological sample with an antibody of Claim 13,

17. The method of Claim 16, wherein:
   a) said antibody is conjugated to:
      i) an effector component; or
      ii) a fluorescent label; or
   b) said biological sample is a blood, serum, urine, or stool sample.

18. A method for identifying a compound that modulates a pathology-associated
polypeptide, said method comprising the steps of:
   a) contacting said compound with a pathology-associated polypeptide, said
      polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence
      at least 80% identical to a sequence as described in Tables 2A-80; and
   b) determining the functional effect of said compound upon said polypeptide.

19. A drug screening assay comprising the steps of:
   a) administering a test compound to a mammal having a pathology of Table 1 or a
      cell isolated therefrom; and
   b) comparing the level of gene expression of a polynucleotide that selectively
      hybridizes to a sequence at least 80% identical to a sequence as described in
      Tables 2A-80 in a treated cell or mammal with the level of gene expression of said
      polynucleotide in a control cell or mammal, wherein a test compound that
      modulates said level of expression of the polynucleotide is a candidate for the
      treatment of said pathology.

Box II Observations where unity of invention is lacking (Continuation of Item 2 of first sheet)

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite
payment of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

This application contains the following inventions or groups of inventions which are not so linked as to form a single general
inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional examination fees must
be paid.

Group I, claim(s) 1-7, drawn to a special technical feature of a method for determining presence or absence of a pathological cell in a
patient, said method comprising detecting a nucleic acid comprising a sequence at least 80% identical to a sequence as described in
Tables 2A-80 in a biological sample from said patient, thereby determining the presence or absence of said pathological cell.

Group II, claim(s) 8-11, drawn to a special technical feature of an isolated nucleic acid molecule comprising a sequence as described
in Tables 2A-80, expression vector comprising the nucleic acid and a host cell comprising the expression vector.

Group III, claim(s) 12, drawn to a special technical feature of an isolated polypeptide which is encoded by an isolated nucleic acid
molecule comprising a sequence as described in Tables 2A-80.

Group IV, claim(s) 13, 14, drawn to a special technical feature of an antibody which specifically binds to polypeptide of claim 12.

Group V, claim(s) 15, drawn to a special technical feature of a method for specifically targeting a compound to a pathological cell in
a patient, comprising administering to a patient an antibody of claim 13.

Group VI, claim(s) 16, 17, drawn to a special technical feature of a method for determining the presence or absence of a pathological
cell in a patient, comprising contacting a biological sample with an antibody of claim 13.

Group VII, claim(s) 18, drawn to a special technical feature of a method for identifying a compound that modulates a
pathology-associated polypeptide by contacting the compound with a pathology-associated polypeptide encoded by a polynucleotide which
selectively hybridizes to a sequence at least 80% identical to a sequence described in Tables 2A-80 and determining the functional
effect of the compound on the polypeptide.

Group VIII, claim(s) 19, drawn to a special technical feature of a drug screening assay comprising the steps of: administering a test
compound to a mammal having pathology of Table 1 or a cell isolated therefrom; comparing the level of gene expression of a
polynucleotide which selectively hybridizes to a sequence at least 80% identical to a sequence described in Tables 2A-80 in a treated
cell or mammal with the level of gene expression of the polynucleotide in a control cell or mammal.

The inventions listed as Groups I-VIII do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT
Rule 13.2, they lack the same or corresponding special technical features for the following reasons: claim 8 is anticipated by a
sequence with accession No. BE440042 (Table 2A, first entry) (July 25, 2000), therefore there is no contribution of claim 8 over
prior art.